About This Book


This book describes the PowerPC Architecture in three parts. Part 1, “PowerPC User Instruction Set Architecture” 
on page 1, describes the base instruction set and related facilities available to the application programmer. 
Part 2, “PowerPC Virtual Environment Architecture” on page 117, describes the storage model and related 
instructions and facilities available to the application programmer, and the Time Base as seen by the application 
programmer. Part 3, “PowerPC Operating Environment Architecture” on page 141, describes the system
(privileged) instructions and related facilities. Each PowerPC Implementation Features document defines the
implementation-dependent aspects of a particular implementation. The complete description of the PowerPC
Architecture as instantiated in a given implementation also includes the material in the PowerPC Implementation
Features document for that implementation.
Chapter 1. Introduction


1.1 Overview 

This chapter describes computation modes,
compatibility with the Power Architecture, document
conventions, a processor overview, instruction formats,
storage addressing, and instruction fetching.


1.2 Computation Modes 

The PowerPC Architecture allows for the following 
types of implementation: 

■ 64-bit implementations, in which all registers
except some Special Purpose Registers are 64
bits long, and effective addresses are 64 bits
long. All 64-bit implementations have two modes
of operation: 64-bit mode and 32-bit mode. The
mode controls how the effective address is
interpreted, how status bits are set, and how the
Count Register is tested by Branch Conditional
instructions. All instructions provided for 64-bit
implementations are available in both modes.

■ 32-bit implementations, in which all registers 
except Floating-Point Registers are 32 bits long, 
and effective addresses are 32 bits long. 

Instructions defined in this document are provided in
both 64-bit implementations and 32-bit
implementations unless otherwise stated. Instructions that are
provided only for 64-bit implementations are illegal in
32-bit implementations, and vice versa.

1.2.1 64-bit Implementations 

In both 64-bit mode and 32-bit mode of a 64-bit
implementation, instructions that set a 64-bit register affect
all 64 bits, and the value placed into the register is
independent of mode. In both modes, effective
address computations use all 64 bits of the relevant
registers (General Purpose Registers, Link Register,
Count Register, etc.), and produce a 64-bit result.
However, in 32-bit mode, the high-order 32 bits of the
computed effective address are ignored when
accessing data, and are set to 0 when fetching
instructions.
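
The mode rule above can be pictured with a small model. This is only a sketch: the function names are hypothetical, and both "ignored when accessing data" and "set to 0 when fetching instructions" are modelled here as masking to the low-order 32 bits, which is an assumption about how to represent the rule rather than a statement about any implementation.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def compute_ea(base, displacement):
    """Effective address arithmetic: a 64-bit result in either mode."""
    return (base + displacement) & MASK64


def storage_address(ea, mode_32bit):
    """Address used for the access, given a computed effective address.

    In 32-bit mode the high-order 32 bits are ignored (data access) or
    treated as 0 (instruction fetch); both are modelled by masking.
    """
    return ea & MASK32 if mode_32bit else ea


ea = compute_ea(0x0000_0001_FFFF_FFF0, 0x20)
print(hex(ea))                          # full 64-bit sum
print(hex(storage_address(ea, False)))  # 64-bit mode uses all 64 bits
print(hex(storage_address(ea, True)))   # 32-bit mode uses the low 32 bits
```

Note that the register contents themselves are not truncated: only the address used for the access depends on the mode.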

1.2.2 32-bit Implementations 

For a 32-bit implementation, all references to 64-bit
mode in this document should be disregarded. The
semantics of instructions are as shown in this
document for 32-bit mode in a 64-bit implementation,
except that in a 32-bit implementation all registers
except Floating-Point Registers are 32 bits long. Bit
numbers for registers are shown in braces ({ }) when
they differ from the corresponding numbers for a
64-bit implementation, as described in Section 1.5.1,
“Definitions and Notation” on page 4.

1.3 Instruction Mnemonics and 
Operands 

The description of each instruction includes the
mnemonic and a formatted list of operands. Some
examples are the following.

stw RS,D(RA)

addis RT,RA,SI

PowerPC-compliant assemblers will support the
mnemonics and operand lists exactly as shown. They will
also provide certain extended mnemonics, as
described in Appendix C, “Assembler Extended
Mnemonics” on page 223.

1.4 Compatibility with the Power 
Architecture 

The PowerPC Architecture provides binary
compatibility for Power application programs, except as
described in Appendix G, “Incompatibilities with the
Power Architecture” on page 257.

Many of the PowerPC instructions are identical to
Power instructions. For some of these the PowerPC 
instruction name and/or mnemonic differs from that in 
Power. To assist readers familiar with the Power 
Architecture, Power mnemonics are shown with the 
individual instruction descriptions when they differ 
from the PowerPC mnemonics. Also, Appendix F, 
“Cross-Reference for Changed Power Mnemonics” on 
page 255, provides a cross-reference from Power 
mnemonics to PowerPC mnemonics for the 
instructions in this document. 


1.5 Document Conventions 


1.5.1 Definitions and Notation 

The following definitions and notation are used 
throughout the PowerPC Architecture documents. 

■ A program is a sequence of related instructions. 

■ Quadwords are 128 bits, doublewords are 64 bits,
words are 32 bits, halfwords are 16 bits, and
bytes are 8 bits.

■ All numbers are decimal unless specified in some
special way.

  - 0bnnnn means a number expressed in binary
    format.

  - 0xnnnn means a number expressed in
    hexadecimal format.

Underscores may be used between digits.

■ RT, RA, R1, ... refer to General Purpose Registers.

■ FRT, FRA, FR1, ... refer to Floating-Point Registers.

■ (x) means the contents of register x, where x is 
the name of an instruction field. For example, 
(RA) means the contents of register RA, and 
(FRA) means the contents of register FRA, where 
RA and FRA are instruction fields. Names such 
as LR and CTR denote registers, not fields, so 
parentheses are not used with them. Also, when 
register x is assigned to, parentheses are 
omitted. 

■ (RA|0) means the contents of register RA if the 
RA field has the value 1-31, or the value 0 if the 
RA field is 0. 

■ Bits in registers, instructions, and fields are
specified as follows.

  - Bits are numbered left to right, starting with
    bit 0.

  - Ranges of bits are specified by two numbers
    separated by a colon (:). The range p:q
    consists of bits p through q.

  - For registers that are 64 bits long in 64-bit
    implementations and 32 bits long in 32-bit
    implementations, bit numbers and ranges are
    specified with the values for 32-bit
    implementations enclosed in braces ({ }). {} means a
    bit that does not exist in 32-bit
    implementations. {:} means a range that does not exist
    in 32-bit implementations.

■ X_p means bit p of register/field X.

X_p{r} means bit p of register/field X in a 64-bit
implementation, and bit r of register/field X in a
32-bit implementation.

■ X_p:q means bits p through q of register/field X.

X_p:q{r:s} means bits p through q of register/field X
in a 64-bit implementation, and bits r through s of
register/field X in a 32-bit implementation.

■ X_p q ... means bits p, q, ... of register/field X.

X_p q ...{r s ...} means bits p, q, ... of register/field X
in a 64-bit implementation, and bits r, s, ... of
register/field X in a 32-bit implementation.

■ ¬(RA) means the one's complement of the
contents of register RA.

■ Field i refers to bits 4×i to 4×i+3 of a register.

■ A period (.) as the last character of an instruction
mnemonic means that the instruction records
status information in certain fields of certain
Special Purpose Registers as a side effect of
execution, as described in Chapter 2 through
Chapter 4.

■ The symbol || is used to describe the
concatenation of two values. For example, 010 || 111 is
the same as 010111.

■ xⁿ means x raised to the nth power.

■ ⁿx means the replication of x, n times (i.e., x
concatenated to itself n-1 times). ⁿ0 and ⁿ1 are
special cases:

  - ⁿ0 means a field of n bits with each bit equal
    to 0. Thus ⁵0 is equivalent to 0b00000.

  - ⁿ1 means a field of n bits with each bit equal
    to 1. Thus ⁵1 is equivalent to 0b11111.

■ Positive means greater than zero. 

■ Negative means less than zero. 

■ A system library program is a component of the
system software that can be called by an
application program using a Branch instruction.

■ A system service program is a component of the
system software that can be called by an
application program using a System Call instruction.

■ The system trap handler is a component of the
system software that receives control when the
conditions specified in a Trap instruction are
satisfied.

■ The system error handler is a component of the
system software that receives control when an
error occurs. The system error handler includes
a component for each of the various kinds of
error. These error-specific components are
referred to as the system alignment error
handler, the system data storage error handler,
etc.

■ Each bit and field in instructions, and in status 
and control registers (XER and FPSCR) and 
Special Purpose Registers, is either defined or 
reserved. 

■ /, //, ///, ... denotes a reserved field in an
instruction.

■ Latency refers to the interval from the time an 
instruction begins execution until it produces a 
result that is available for use by a subsequent 
instruction. 

■ Unavailable refers to a resource that cannot be 
used by the program. Data or instruction storage 
is unavailable if an instruction is denied access to 
it. Floating-point instructions are unavailable if 
use of them is denied. See Part 3, “PowerPC 
Operating Environment Architecture” on 
page 141. 
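
Several of the notational conventions above (left-to-right bit numbering, p:q ranges, Field i, concatenation, replication, and (RA|0)) can be illustrated with ordinary integer arithmetic. The helper names below are hypothetical and exist only for this sketch; values are carried together with an explicit bit length because the notation numbers bits from the left.

```python
def bits(value, p, q, width=64):
    """Bits p:q of a width-bit value, where bit 0 is the leftmost bit."""
    if not 0 <= p <= q < width:
        raise ValueError("need 0 <= p <= q < width")
    return (value >> (width - 1 - q)) & ((1 << (q - p + 1)) - 1)


def field(value, i, width=32):
    """Field i: bits 4*i through 4*i+3 of a register."""
    return bits(value, 4 * i, 4 * i + 3, width)


def concat(x, x_len, y, y_len):
    """x || y, returned as (value, length in bits)."""
    return (x << y_len) | y, x_len + y_len


def replicate(n, x, x_len=1):
    """n copies of x placed side by side, returned as (value, length)."""
    result = 0
    for _ in range(n):
        result = (result << x_len) | x
    return result, n * x_len


def ra_or_zero(gprs, ra):
    """(RA|0): contents of GPR ra if ra is 1-31, or the value 0 if ra is 0."""
    return 0 if ra == 0 else gprs[ra]


print(bits(0x12345678, 0, 5, width=32))   # leftmost six bits of a word
print(concat(0b010, 3, 0b111, 3))         # 010 || 111
print(replicate(5, 1))                    # five 1-bits
```

Because bit 0 is the most significant bit, extracting bits p:q requires the register width; the same call with a different width selects different physical bits.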

1.5.2 Reserved Fields 

All reserved fields in instructions should be zero. If 
they are not, the instruction form is invalid: see 
Section 1.9.2, “Invalid Instruction Forms” on page 13. 

The handling of reserved bits in status and control
registers (XER and FPSCR) and in Special Purpose
Registers (and Segment Registers: see Part 3,
“PowerPC Operating Environment Architecture” on
page 141) is implementation dependent. For each
such reserved bit, an implementation shall either:

■ ignore the source value for the bit on write, and 
return zero for it on read; or 

■ set the bit from the source value on write, and 
return the value last set for it on read. 


— Programming Note - 

It is the responsibility of software to preserve bits
that are now reserved in status and control
registers and in Special Purpose Registers (and
Segment Registers: see Part 3, “PowerPC
Operating Environment Architecture” on page 141), as
they may be assigned a meaning in some future
version of the architecture or in Book IV,
PowerPC Implementation Features for some
implementation. In order to accomplish this
preservation in implementation independent fashion,
software should do the following.

■ Initialize each such register supplying zeros 
for all reserved bits. 

■ Alter (defined) bit(s) in the register by reading 
the register, altering only the desired bit(s), 
and then writing the new value back to the 
register. 

When a currently reserved bit is subsequently 
assigned a meaning, every effort will be made to 
have the value to which the system initializes the 
bit correspond to the “old behavior.” 
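
The read, alter, and write-back sequence recommended in the Programming Note can be sketched as follows. The `FakeRegister` class is a hypothetical stand-in for a 32-bit status or control register, and the bit masks are arbitrary; nothing here corresponds to a real register layout.

```python
class FakeRegister:
    """Hypothetical 32-bit register used only to exercise the sequence."""

    def __init__(self, value=0):
        # Initialised with zeros in all reserved bits unless told otherwise.
        self.value = value & 0xFFFFFFFF

    def read(self):
        return self.value

    def write(self, value):
        self.value = value & 0xFFFFFFFF


def update_defined_bits(reg, set_mask=0, clear_mask=0):
    """Read the register, change only the requested bits, write it back.

    Every bit outside set_mask and clear_mask, including any reserved bit,
    is written back with the value that was read.
    """
    old = reg.read()
    new = (old | set_mask) & ~clear_mask
    reg.write(new)
    return new


reg = FakeRegister(0x0000_00F0)
update_defined_bits(reg, set_mask=0x1)
update_defined_bits(reg, clear_mask=0x10)
print(hex(reg.read()))
```

The sequence behaves the same under either permitted hardware treatment of reserved bits, since it never supplies a value for a bit it did not first read.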


1.5.3 Description of Instruction 
Operation 

A formal description is given of the operation of each 
instruction. In addition, the operation of most 
instructions is described by a semiformal language at 
the register transfer level (RTL). This RTL uses the 
notation given below, in addition to the definitions and 
notation described in Section 1.5.1, “Definitions and 
Notation” on page 4. RTL notation not summarized 
here should be self-explanatory. 

The RTL descriptions do not imply any particular 
implementation. 

The RTL descriptions do not cover the following: 

■ “Standard” setting of the Condition Register,
Fixed-Point Exception Register, and Floating-Point
Status and Control Register. “Non-standard”
setting of these registers (e.g., the setting of
Condition Register Field 0 by the stwcx. instruction)
is shown.

■ Invalid instruction forms. 

Notation                Meaning

←                       Assignment

←iea                    Assignment of an instruction effective
                        address. In 32-bit mode of a 64-bit
                        implementation the high-order 32 bits
                        of the 64-bit target are set to 0.

¬                       NOT logical operator

×                       Multiplication

÷                       Division (yielding quotient)

+                       Two's-complement addition

-                       Two's-complement subtraction, unary
                        minus

=, ≠                    Equals and Not Equals relations

<, ≤, >, ≥              Signed comparison relations

<u, >u                  Unsigned comparison relations

?                       Unordered comparison relation

&, |                    AND, OR logical operators

⊕, ≡                    Exclusive-OR, Equivalence logical
                        operators ((a≡b) = (a⊕¬b))

CEIL(x)                 Least integer ≥ x

DOUBLE(x)               Result of converting x from floating-point
                        single format to floating-point double
                        format, using the model shown on page 99

EXTS(x)                 Result of extending x on the left with
                        sign bits

GPR(x)                  General Purpose Register x

MASK(x, y)              Mask having 1's in positions x through y
                        (wrapping if x > y) and 0's elsewhere

MEM(x, y)               Contents of y bytes of memory starting at
                        address x. In 32-bit mode of a 64-bit
                        implementation the high-order 32 bits of
                        the 64-bit value x are ignored.

ROTL64(x, y)            Result of rotating the 64-bit value x
                        left y positions

ROTL32(x, y)            Result of rotating the 64-bit value x||x
                        left y positions, where x is 32 bits long

SINGLE(x)               Result of converting x from floating-point
                        double format to floating-point single
                        format, using the model shown on page 102

SPREG(x)                Special Purpose Register x

TRAP                    Invoke the system trap handler

characterization        Reference to the setting of status bits,
                        in a standard way that is explained in
                        the text

undefined               An undefined value. The value may vary
                        from one implementation to another, and
                        from one execution to another on the
                        same implementation.

CIA                     Current Instruction Address, which is the
                        64{32}-bit address of the instruction
                        being described by a sequence of RTL.
                        Used by relative branches to set the Next
                        Instruction Address (NIA), and by Branch
                        instructions with LK = 1 to set the Link
                        Register. In 32-bit mode of 64-bit
                        implementations, the high-order 32 bits
                        of CIA are always set to 0. Does not
                        correspond to any architected register.

NIA                     Next Instruction Address, which is the
                        64{32}-bit address of the next instruction
                        to be executed. For a successful branch,
                        the next instruction address is the
                        branch target address: in RTL, this is
                        indicated by assigning a value to NIA.
                        For other instructions that cause
                        non-sequential instruction fetching (see
                        Part 3, “PowerPC Operating Environment
                        Architecture” on page 141), the RTL is
                        similar. For instructions that do not
                        branch, and do not otherwise cause
                        instruction fetching to be non-sequential,
                        the next instruction address is CIA+4.
                        In 32-bit mode of 64-bit implementations,
                        the high-order 32 bits of NIA are always
                        set to 0. Does not correspond to any
                        architected register.

if ... then ... else ...  Conditional execution, indenting shows
                        range, else is optional

do                      Do loop, indenting shows range. 'To'
                        and/or 'by' clauses specify incrementing
                        an iteration variable, and 'while' and/or
                        'until' clauses give termination
                        conditions, in the usual manner.

leave                   Leave innermost do loop, or do loop
                        described in leave statement
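
As an informal reading of four of the RTL functions in the notation table, the following sketch models EXTS, ROTL64, ROTL32, and MASK on Python integers. The function signatures are assumptions made for this illustration (for example, EXTS here takes the source width as an explicit argument), and 64-bit bit numbering with bit 0 at the left is assumed throughout.

```python
MASK64 = (1 << 64) - 1


def exts(x, bits, width=64):
    """EXTS(x): extend the bits-wide value x on the left with sign bits."""
    sign = 1 << (bits - 1)
    x &= (1 << bits) - 1
    return ((x ^ sign) - sign) & ((1 << width) - 1)


def rotl64(x, y):
    """ROTL64(x, y): rotate the 64-bit value x left y positions."""
    x &= MASK64
    y %= 64
    return ((x << y) | (x >> (64 - y))) & MASK64


def rotl32(x, y):
    """ROTL32(x, y): rotate the 64-bit value x||x left y positions."""
    x &= 0xFFFFFFFF
    return rotl64((x << 32) | x, y)


def mask(x, y):
    """MASK(x, y): 1s in positions x through y, wrapping if x > y."""
    def ones_from(p):
        # 1s in bit positions p..63 (bit 0 is the leftmost bit).
        return MASK64 >> p

    if x <= y:
        return ones_from(x) & ~ones_from(y + 1) & MASK64
    return (ones_from(x) | ~ones_from(y + 1)) & MASK64


print(hex(exts(0x8000, 16)))
print(hex(rotl32(0x80000001, 1)))
print(hex(mask(60, 3)))
```

In `rotl32` the doubled operand x||x is what makes a 64-bit rotate behave like a 32-bit rotate in each half of the result.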

The precedence rules for RTL operators are
summarized in Table 1. Operators higher in the table are
applied before those lower in the table. Operators at
the same level in the table associate from left to
right, from right to left, or not at all, as shown. (For
example, - associates from left to right, so a-b-c =
(a-b)-c.) Parentheses are used to override the
evaluation order implied by the table, or to increase
clarity: parenthesized expressions are evaluated
before serving as operands.


Table 1. Operator Precedence

Operators                                  Associativity

subscript, function evaluation             left to right
pre-superscript (replication),
post-superscript (exponentiation)          right to left
unary -, ¬                                 right to left
×, ÷                                       left to right
+, -                                       left to right
||                                         left to right
=, ≠, <, ≤, >, ≥, <u, >u, ?                left to right
&, ⊕, ≡                                    left to right
|                                          left to right
: (range)                                  none
←                                          none

1.6 Processor Overview 
[Figure 1 lists, side by side for 64-bit implementations and 32-bit
implementations, the following registers:

CR                     Condition Register (page 17)
LR                     Link Register (page 18)
CTR                    Count Register (page 18)
XER                    Fixed-Point Exception Register (page 27)
FPSCR                  Floating-Point Status and Control Register (page 84)
GPR 00 ... GPR 31      General Purpose Registers (page 27)
FPR 00 ... FPR 31      Floating-Point Registers (page 83)]

Figure 1. PowerPC User Register Set


The processor implements the instruction set, the
storage model, and other facilities defined in this
document. Instructions which the processor can execute
fall into the following classes.

■ branch instructions, 

■ fixed-point instructions, and 

■ floating-point instructions. 

Branch instructions are described in Section 2.4, 
“Branch Processor Instructions” on page 19. Fixed- 
point instructions are described in Section 3.3, “Fixed- 
Point Processor Instructions” on page 29. 
Floating-point instructions are described in Section 
4.6, “Floating-Point Processor Instructions” on 
page 99. 

Fixed-point instructions operate on byte, halfword,
word, and, in 64-bit implementations, doubleword
operands. Floating-point instructions operate on
single-precision and double-precision floating-point
operands. The PowerPC Architecture uses
instructions that are four bytes long and word-aligned.
It provides for byte, halfword, word, and, in 64-bit
implementations, doubleword operand fetches and
stores between storage and a set of 32 General
Purpose Registers (GPRs). It also provides for word
and doubleword operand fetches and stores between
storage and a set of 32 Floating-Point Registers
(FPRs).

There are no computational instructions that modify
storage. To use a storage operand in a computation
and then modify the same or another storage
location, the content of storage must be loaded into a
register, modified, and then stored back to the target
location. Figure 2 on page 8 is a logical
representation of instruction processing. Figure 1 shows the
registers of the PowerPC User Instruction Set
Architecture.
[Figure 2 is a block diagram with the labels Branch Processing,
Fixed-Point and Floating-Point Instructions, Fixed-Pt Processing,
Float-Pt Processing, Data to/from Storage, Instructions from Storage,
and Storage.]

Figure 2. Logical Processing Model

1.7 Instruction Formats

All instructions are four bytes long and word-aligned.
Thus, whenever instruction addresses are presented
to the processor (as in Branch instructions) the two
low-order bits are ignored. Similarly, whenever the
processor develops an instruction address its two
low-order bits are zero.

Bits 0:5 always specify the opcode (OPCD, below).
Many instructions also have an extended opcode (XO,
below). The remaining bits of the instruction contain
one or more fields as shown below for the different
instruction formats.

The format diagrams given below show horizontally
all valid combinations of instruction fields. The
diagrams include instruction fields that are used only by
instructions defined in Part 2, “PowerPC Virtual
Environment Architecture” on page 117, or in Part 3,
“PowerPC Operating Environment Architecture” on
page 141. See those Books for the definitions of such
fields.

In some cases an instruction field is reserved, or
must contain a particular value. If a reserved field
does not have all bits set to 0, or if a field that must
contain a particular value does not contain that value,
the instruction form is invalid and the results are as
described in Section 1.9.2, “Invalid Instruction Forms”
on page 13.

Split Field Notation

In some cases an instruction field occupies more than
one contiguous sequence of bits, or occupies one
contiguous sequence of bits which are used in permuted
order. Such a field is called a “split field.” In the
format diagrams given below and in the individual
instruction layouts, the name of a split field is shown
in small letters, once for each of the contiguous
sequences. In the RTL description of an instruction
having a split field, and in certain other places where
individual bits of a split field are identified, the name
of the field in small letters represents the
concatenation of the sequences from left to right. In all
other places, the name of the field is capitalized, and
represents the concatenation of the sequences in
some order, which need not be left to right, as
described for each affected instruction.


1.7.1 I-Form


0       6                                     30  31

OPCD    LI                                    AA  LK

Figure 3. I Instruction Format


1.7.2 B-Form


0       6       11      16                    30  31

OPCD    BO      BI      BD                    AA  LK


Figure 4. B Instruction Format
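
To make the I-form and B-form layouts concrete, the sketch below splits a 32-bit instruction word into the fields shown in Figures 3 and 4, and clears the two low-order bits of an instruction address as described in Section 1.7. The function names are hypothetical, and the words used to exercise them are arbitrary bit patterns rather than instructions taken from this document.

```python
def bits32(word, p, q):
    """Bits p:q of a 32-bit instruction word, bit 0 being the leftmost."""
    return (word >> (31 - q)) & ((1 << (q - p + 1)) - 1)


def decode_i_form(word):
    """Split a word into the I-form fields OPCD, LI, AA, LK."""
    return {
        "OPCD": bits32(word, 0, 5),
        "LI": bits32(word, 6, 29),
        "AA": bits32(word, 30, 30),
        "LK": bits32(word, 31, 31),
    }


def decode_b_form(word):
    """Split a word into the B-form fields OPCD, BO, BI, BD, AA, LK."""
    return {
        "OPCD": bits32(word, 0, 5),
        "BO": bits32(word, 6, 10),
        "BI": bits32(word, 11, 15),
        "BD": bits32(word, 16, 29),
        "AA": bits32(word, 30, 30),
        "LK": bits32(word, 31, 31),
    }


def instruction_address(addr):
    """Drop the two low-order bits, which are ignored for instruction addresses."""
    return addr & ~0b11


# Arbitrary example word: OPCD=18, LI=0x10, AA=1, LK=1.
print(decode_i_form((18 << 26) | (0x10 << 2) | 0b11))
print(hex(instruction_address(0x1003)))
```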

1.7.3 SC-Form


0       6       11      16                    30  31

OPCD    ///     ///     ///                   XO  /


Figure 5. SC Instruction Format
1.7.4 D-Form


0       6       11      16                        31

OPCD    RT      RA      D
OPCD    RT      RA      SI
OPCD    RS      RA      D
OPCD    RS      RA      UI
OPCD    BF / L  RA      SI
OPCD    BF / L  RA      UI
OPCD    TO      RA      SI
OPCD    FRT     RA      D
OPCD    FRS     RA      D


Figure 6. D Instruction Format


1.7.5 DS-Form


0       6       11      16                    30  31

OPCD    RT      RA      DS                    XO
OPCD    RS      RA      DS                    XO


Figure 7. DS Instruction Format (64-bit implementations only)


1.7.6 X-Form


0       6       11      16      21            31

OPCD    RT      RA      RB      XO            /
OPCD    RT      RA      NB      XO            /
OPCD    RT      / SR    ///     XO            /
OPCD    RT      ///     RB      XO            /
OPCD    RT      ///     ///     XO            /
OPCD    RS      RA      RB      XO            Rc
OPCD    RS      RA      RB      XO            1
OPCD    RS      RA      RB      XO            /
OPCD    RS      RA      NB      XO            /
OPCD    RS      RA      SH      XO            Rc
OPCD    RS      RA      ///     XO            Rc
OPCD    RS      / SR    ///     XO            /
OPCD    RS      ///     RB      XO            /
OPCD    RS      ///     ///     XO            /
OPCD    BF / L  RA      RB      XO            /
OPCD    BF //   FRA     FRB     XO            /
OPCD    BF //   BFA //  ///     XO            /
OPCD    BF //   ///     U /     XO            Rc
OPCD    BF //   ///     ///     XO            /
OPCD    TO      RA      RB      XO            /
OPCD    FRT     RA      RB      XO            /
OPCD    FRT     ///     FRB     XO            Rc
OPCD    FRT     ///     ///     XO            Rc
OPCD    FRS     RA      RB      XO            /
OPCD    BT      ///     ///     XO            Rc
OPCD    ///     RA      RB      XO            /
OPCD    ///     ///     RB      XO            /
OPCD    ///     ///     ///     XO            /


Figure 8. X Instruction Format
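
As an illustration of how a format diagram maps onto an instruction word, the sketch below packs and unpacks the first D-form row (OPCD, RT, RA, D), which is the shape of the `stw RS,D(RA)` example in Section 1.3. The function names are hypothetical. The primary opcode value 36 used for `stw` is not given in this chapter and is supplied here as an assumption from outside it.

```python
def encode_d_form(opcd, rt, ra, d):
    """Pack OPCD (0:5), RT/RS (6:10), RA (11:15), and D (16:31) into a word."""
    if not 0 <= opcd < 64 or not 0 <= rt < 32 or not 0 <= ra < 32:
        raise ValueError("field out of range")
    if not -0x8000 <= d <= 0x7FFF:
        raise ValueError("D must fit in a 16-bit signed integer")
    return (opcd << 26) | (rt << 21) | (ra << 16) | (d & 0xFFFF)


def decode_d_form(word):
    """Unpack a word into (OPCD, RT, RA, D), sign-extending D."""
    d = word & 0xFFFF
    if d & 0x8000:
        d -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, d


STW_OPCD = 36  # assumed opcode value, not stated in this chapter

word = encode_d_form(STW_OPCD, 3, 1, -8)   # stw with RS=3, D=-8, RA=1
print(hex(word))
print(decode_d_form(word))
```

The same packing applies to the SI and UI rows of Figure 6, except that UI is an unsigned 16-bit value and would not be sign-extended on decode.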


1.7.7 XL-Form


0       6       11      16      21            31

OPCD    BT      BA      BB      XO            /
OPCD    BO      BI      ///     XO            LK
OPCD    BF //   BFA //  ///     XO            /
OPCD    ///     ///     ///     XO            /


Figure 9. XL Instruction Format
1.7.8 XFX-Form


0       6       11                  21            31

OPCD    RT      spr                 XO            /
OPCD    RT      tbr                 XO            /
OPCD    RT      / FXM /             XO            /
OPCD    RS      spr                 XO            /


Figure 10. XFX Instruction Format

1.7.9 XFL-Form


0       6   7       15  16      21            31

OPCD    /   FLM     /   FRB     XO            Rc


Figure 11. XFL Instruction Format

1.7.10 XS-Form


0       6       11      16      21            30  31

OPCD    RS      RA      sh      XO            sh  Rc


Figure 12. XS Instruction Format (64-bit implementations only)


1.7.11 XO-Form


0       6       11      16      21  22        31

OPCD    RT      RA      RB      OE  XO        Rc
OPCD    RT      RA      RB      /   XO        Rc
OPCD    RT      RA      ///     OE  XO        Rc


Figure 13. XO Instruction Format


1.7.12 A-Form


0       6       11      16      21      26      31

OPCD    FRT     FRA     FRB     FRC     XO      Rc
OPCD    FRT     FRA     FRB     ///     XO      Rc
OPCD    FRT     FRA     ///     FRC     XO      Rc
OPCD    FRT     ///     FRB     ///     XO      Rc


Figure 14. A Instruction Format


1.7.13 M-Form


0       6       11      16      21      26      31

OPCD    RS      RA      RB      MB      ME      Rc
OPCD    RS      RA      SH      MB      ME      Rc


Figure 15. M Instruction Format


1.7.14 MD-Form


0       6       11      16      21      27  30  31

OPCD    RS      RA      sh      mb      XO  sh  Rc
OPCD    RS      RA      sh      me      XO  sh  Rc


Figure 16. MD Instruction Format (64-bit implementations only)


1.7.15 MDS-Form


0       6       11      16      21      27      31

OPCD    RS      RA      RB      mb      XO      Rc
OPCD    RS      RA      RB      me      XO      Rc


Figure 17. MDS Instruction Format (64-bit implementations only)


1.7.16 Instruction Fields


AA (30) 

Absolute Address bit 

0 The immediate field represents an address
relative to the current instruction address.
For I-form branches the effective address of
the branch target is the sum of the LI field
sign-extended to 64 bits and the address of
the branch instruction. For B-form branches
the effective address of the branch target is
the sum of the BD field sign-extended to 64
bits and the address of the branch
instruction.

1 The immediate field represents an absolute
address. For I-form branches the effective
address of the branch target is the LI field
sign-extended to 64 bits. For B-form
branches the effective address of the branch
target is the BD field sign-extended to 64
bits.

BA (11:15) 

Field used to specify a bit in the CR to be used as
a source.

BB (16:20)

Field used to specify a bit in the CR to be used as
a source.

BD (16:29)

Immediate field specifying a 14-bit signed two's
complement branch displacement which is
concatenated on the right with 0b00 and
sign-extended to 64 bits.

BF (6:8) 

Field used to specify one of the CR fields or one 
of the FPSCR fields as a target. 

BFA (11:13) 

Field used to specify one of the CR fields or one 
of the FPSCR fields as a source. 

BI (11:15)

Field used to specify a bit in the CR to be used as
the condition of a Branch Conditional instruction.

BO (6:10)

Field used to specify options for the Branch
Conditional instructions. The encoding is described in
Section 2.4, “Branch Processor Instructions” on
page 19.

BT (6:10) 

Field used to specify a bit in the CR or in the
FPSCR as the target of the result of an
instruction.

D (16:31)

Immediate field specifying a 16-bit signed two's
complement integer which is sign-extended to 64
bits.

DS (16:29)

Immediate field specifying a 14-bit signed two's
complement integer which is concatenated on the
right with 0b00 and sign-extended to 64 bits. This
field is defined in 64-bit implementations only.

FLM (7:14) 

Field mask used to identify the FPSCR fields that 
are to be updated by the mtfsf instruction. 

FRA (11:15) 

Field used to specify an FPR as a source of an 
operation. 

FRB (16:20) 

Field used to specify an FPR as a source of an 
operation. 

FRC (21:25) 

Field used to specify an FPR as a source of an 
operation. 

FRS (6:10) 

Field used to specify an FPR as a source of an 
operation. 

FRT (6:10) 

Field used to specify an FPR as the target of an 
operation. 

FXM (12:19) 

Field mask used to identify the CR fields that are 
to be updated by the mtcrf instruction. 


L (10) 

Field used to specify whether a Fixed-Point 
Compare instruction is to compare 64-bit 
numbers or 32-bit numbers. This field is defined 
in 64-bit implementations only. 

LI (6:29) 

Immediate field specifying a 24-bit signed two's
complement integer which is concatenated on the
right with 0b00 and sign-extended to 64 bits.

LK (31) 

LINK bit. 

0 Do not set the Link Register. 

1 Set the Link Register. If the instruction is a 
Branch instruction, the address of the 
instruction following the Branch instruction is 
placed into the Link Register. 

MB (21:25) and ME (26:30) 

Fields used in M-form instructions to specify a
64-bit mask consisting of 1-bits from bit MB+32
through bit ME+32 inclusive, and 0-bits
elsewhere, as described in Section 3.3.13,
“Fixed-Point Rotate and Shift Instructions” on page 69.

MB (21:26) 

Field used in MD-form and MDS-form instructions 
to specify the first 1-bit of a 64-bit mask, as 
described in Section 3.3.13, “Fixed-Point Rotate 
and Shift Instructions” on page 69. This field is 
defined in 64-bit implementations only. 

ME (21:26) 

Field used in MD-form and MDS-form instructions 
to specify the last 1-bit of a 64-bit mask, as 
described in Section 3.3.13, “Fixed-Point Rotate 
and Shift Instructions” on page 69. This field is 
defined in 64-bit implementations only. 

NB (16:20) 

Field used to specify the number of bytes to 
move in an immediate string load or store. 

OPCD (0:5) 

Primary opcode field. 

OE (21) 

Used for extended arithmetic to enable setting 
OV and SO in the XER. 

RA (11:15) 

Field used to specify a GPR to be used as a 
source or as a target. 

RB (16:20) 

Field used to specify a GPR to be used as a 
source. 

Rc (31) 

RECORD bit 

0 Do not set the Condition Register. 

1 Set the Condition Register to reflect the 
result of the operation. 

For fixed-point instructions, CR bits 0:3 are
set to reflect the result as a signed quantity.
The result as an unsigned quantity or a bit
string can be deduced from the EQ bit.

For floating-point instructions, CR bits 4:7
are set to reflect Floating-Point Exception,
Floating-Point Enabled Exception,
Floating-Point Invalid Operation Exception, and
Floating-Point Overflow Exception.

RS (6:10) 

Field used to specify a GPR to be used as a 
source. 

RT (6:10) 

Field used to specify a GPR to be used as a 
target. 

SH (16:20, or 16:20 and 30) 

Field used to specify a shift amount. Location 
16:20 and 30 pertains to 64-bit implementations 
only. 

SI (16:31) 

Immediate field used to specify a 16-bit signed 
integer. 

SPR (11:20) 

Field used to specify a Special Purpose Register
for the mtspr and mfspr instructions. The
encoding is described in Section 3.3.14, “Move
To/From System Register Instructions” on
page 79.

SR (12:15) 

See Part 3, “PowerPC Operating Environment 
Architecture” on page 141. 

TBR (11:20) 

See Part 2, “PowerPC Virtual Environment 
Architecture” on page 117. 

TO (6:10) 

Field used to specify the conditions on which to 
trap. The encoding is described in Section 3.3.11, 
“Fixed-Point Trap Instructions” on page 61. 

U (16:19) 

Immediate field used as the data to be placed 
into a field in the FPSCR. 

UI (16:31)

Immediate field used to specify a 16-bit unsigned 
integer. 

XO (21:29, 21:30, 22:30, 26:30, 27:29, 27:30, 30, or 
30:31) 

Extended opcode field. Locations 21:29, 27:29, 
27:30, and 30:31 pertain to 64-bit implementations 
only. 
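
Two of the field definitions above describe small computations: the branch target derived from LI or BD together with AA, and the 64-bit mask selected by MB and ME in M-form instructions. The sketch below models both. The function names are hypothetical, 64-bit addresses are assumed, and the mask helper covers only the case MB ≤ ME, since wrap-around masks are described in Section 3.3.13 rather than here.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low-order bits of value as a signed integer."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def branch_target(cia, imm, imm_bits, aa):
    """Branch target from LI (imm_bits=24) or BD (imm_bits=14).

    The immediate is concatenated on the right with 0b00 and sign-extended.
    With AA=0 it is added to the address of the branch instruction (cia);
    with AA=1 it is used as an absolute address.
    """
    displacement = sign_extend(imm << 2, imm_bits + 2)
    target = displacement if aa else cia + displacement
    return target & MASK64


def m_form_mask(mb, me):
    """64-bit mask with 1-bits from bit MB+32 through bit ME+32 (MB <= ME)."""
    if not 0 <= mb <= me <= 31:
        raise ValueError("this sketch handles only 0 <= MB <= ME <= 31")
    length = me - mb + 1
    return ((1 << length) - 1) << (31 - me)


print(hex(branch_target(0x1000, 0x4, 24, aa=0)))        # forward, relative
print(hex(branch_target(0x1000, 0xFFFFFF, 24, aa=0)))   # backward, relative
print(hex(m_form_mask(24, 31)))
```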


1.8 Classes of Instructions 

An instruction falls into exactly one of the following 
three classes: 

Defined 

Illegal 

Reserved 

The class is determined by examining the opcode, and
the extended opcode if any. If the opcode, or
combination of opcode and extended opcode, is not that of
a defined instruction nor of a reserved instruction, the
instruction is illegal.

Some instructions are defined only for 64-bit
implementations and a few are defined only for 32-bit
implementations (see 1.8.2, “Illegal Instruction Class”
on page 13). With the exception of these, a given
instruction is in the same class for all
implementations of the PowerPC Architecture. In future versions
of this architecture, instructions that are now illegal
may become defined (by being added to the
architecture) or reserved (by being assigned to one of the
special purposes described in Appendix J, “Reserved
Instructions” on page 265). Similarly, instructions
that are now reserved may become defined.

The results of attempting to execute a given
instruction are said to be boundedly undefined if they could
have been achieved by executing an arbitrary
sequence of defined instructions, in valid form (see
below), starting in the state the machine was in
before attempting to execute the given instruction.
Boundedly undefined results for a given instruction
may vary between implementations, and between
execution attempts in the same implementation, and
are not further defined in this document.


1.8.1 Defined Instruction Class 

This class of instructions contains all the instructions 
defined in the PowerPC User Instruction Set Architec¬ 
ture, PowerPC Virtual Environment Architecture, and 
PowerPC Operating Environment Architecture. 

Defined instructions are guaranteed to be supported 
in all implementations, except as stated in the instruc¬ 
tion descriptions. (The exceptions are instructions 
that are supported only in 64-bit implementations or 
only in 32-bit implementations.) 

A defined instruction can have preferred and/or
invalid forms, as described in Section 1.9.1, “Preferred
Instruction Forms” on page 13, and Section
1.9.2, “Invalid Instruction Forms” on page 13.

1.8.2 Illegal Instruction Class

This class of instructions contains the set of 
instructions described in Appendix I, “Illegal 
Instructions” on page 263. For 64-bit implementa¬ 
tions this class includes all instructions that are 
defined only for 32-bit implementations. For 32-bit 
implementations it includes all instructions that are 
defined only for 64-bit implementations. 

Excluding instructions that are defined for one type of
implementation but not the other, illegal instructions
are available for future extensions of the PowerPC
Architecture: that is, some future version of the
PowerPC Architecture may define any of these
instructions to perform new functions.

Any attempt to execute an illegal instruction will 
cause the system illegal instruction error handler to 
be invoked and will have no other effect. 

An instruction consisting entirely of binary 0's is guar¬ 
anteed always to be an illegal instruction. This 
increases the probability that an attempt to execute 
data or uninitialized storage will result in the invoca¬ 
tion of the system illegal instruction error handler. 

1.8.3 Reserved Instruction Class 

This class of instructions contains the set of 
instructions described in Appendix J, “Reserved 
Instructions” on page 265. 

Reserved instructions are allocated to specific pur¬ 
poses that are outside the scope of the PowerPC 
Architecture. 

Any attempt to execute a reserved instruction will 
either cause the system illegal instruction error 
handler to be invoked or will yield boundedly unde¬ 
fined results. 

1.9 Forms of Defined 
Instructions 

1.9.1 Preferred Instruction Forms 

Some of the defined instructions have preferred 
forms. For such an instruction, the preferred form will 
execute in an efficient manner, but any other form 
may take significantly longer to execute than the pre¬ 
ferred form. 

Instructions having preferred forms are: 

■ the Load/Store Multiple instructions 

■ the Load/Store String instructions 


■ the Or Immediate instruction (preferred form of 
no-op) 

1.9.2 Invalid Instruction Forms 

Some of the defined instructions have invalid forms. 
An instruction form is invalid if one or more fields of 
the instruction, excluding the opcode field(s), are 
coded incorrectly. 

Any attempt to execute an invalid form of an instruc¬ 
tion will either cause the system illegal instruction 
error handler to be invoked or will yield boundedly 
undefined results. Exceptions to this rule are stated 
in the instruction descriptions. 

Some kinds of invalid form can be deduced from the 
instruction layout. These are listed below. 

■ Field shown as '/'(s) but coded as non-zero. 

■ Field shown as containing a particular value but 
coded as some other value. 

These invalid forms are not discussed further. 

Instructions having invalid forms that cannot be so 
deduced are listed below. For these, the invalid 
forms are identified in the instruction descriptions. 

■ the Branch Conditional instructions 

■ the Load/Store with Update instructions 

■ the Load Multiple instructions 

■ the Load String instructions 

■ the Fixed-Point Compare instructions (invalid 
form exists only in 32-bit implementations) 

■ Move To/From Special Purpose Register (mtspr,
mfspr)

■ the Load/Store Floating-Point with Update 
instructions 

- Assembler Note - 

To the extent possible, the Assembler should 
report uses of invalid instruction forms as errors. 


1.9.3 Optional Instructions 

Some of the defined instructions are optional. The 
optional instructions are defined in Appendix A, 
“Optional Instructions” on page 207, and also in the 
section entitled “Lookaside Buffer Management 
Instructions (Optional)” and the appendix entitled 
“Optional Facilities and Instructions” of Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141. 

Any attempt to execute an optional instruction that is
not provided by the implementation will cause the
system illegal instruction error handler to be invoked.
Exceptions to this rule are stated in the instruction
descriptions.


1.10 Exceptions 

There are two kinds of exception, those caused 
directly by the execution of an instruction and those 
caused by an asynchronous event. In either case, the 
exception may cause one of several components of 
the system software to be invoked. 

The exceptions that can be caused directly by the 
execution of an instruction include the following. 

■ an attempt to execute an illegal instruction, or an 
attempt by an application program to execute a 
“privileged” instruction (see Part 3, “PowerPC 
Operating Environment Architecture” on 
page 141) (system illegal instruction error 
handler or system privileged instruction error 
handler) 

■ the execution of a defined instruction using an 
invalid form (system illegal instruction error 
handler or system privileged instruction error 
handler) 

■ the execution of an optional instruction that is not 
provided by the implementation (system illegal 
instruction error handler) 

■ an attempt to access a storage location that is 
unavailable (system error handler) 

■ an attempt to access storage with an effective 
address alignment that is invalid for the instruc¬ 
tion (system alignment error handler) 

■ the execution of a System Call instruction 
(system service program) 

■ the execution of a Trap instruction that traps 
(system trap handler) 

■ the execution of a floating-point instruction when 
floating-point instructions are unavailable (system 
floating-point unavailable error handler) 

■ the execution of a floating-point instruction that 
causes a floating-point exception that is enabled 
(system floating-point enabled exception error 
handler) 

■ the execution of a floating-point instruction that 
requires system software assistance (system 
floating-point assist error handler; the conditions 
under which such software assistance is required 
are implementation-dependent) 

The exceptions that can be caused by an asynchro¬ 
nous event are described in Part 3, “PowerPC Oper¬ 
ating Environment Architecture” on page 141. 


The invocation of the system error handler is precise, 
except that if one of the imprecise modes for invoking 
the system floating-point enabled exception error 
handler is in effect (see page 92) then the invocation 
of the system floating-point enabled exception error 
handler may be imprecise. When the system error 
handler is invoked imprecisely, the excepting instruc¬ 
tion does not appear to complete before the next 
instruction starts (because one of the effects of the 
excepting instruction, namely the invocation of the 
system error handler, has not yet occurred). 

Additional information about exception handling can 
be found in Part 3, “PowerPC Operating Environment 
Architecture” on page 141. 


1.11 Storage Addressing 

A program references storage using the effective 
address computed by the processor when it executes 
a Storage Access or Branch instruction (or certain 
other instructions described in Part 2, “PowerPC 
Virtual Environment Architecture” on page 117, and 
Part 3, “PowerPC Operating Environment 
Architecture” on page 141), or when it fetches the 
next sequential instruction. 

1.11.1 Storage Operands 

Bytes in storage are numbered consecutively starting 
with 0. Each number is the address of the corre¬ 
sponding byte. 

Storage operands may be bytes, halfwords, words, or 
doublewords, or, for the Load/Store Multiple and 
Move Assist instructions, a sequence of bytes or 
words. The address of a storage operand is the 
address of its first byte (i.e., of its lowest-numbered 
byte). Byte ordering is Big-Endian by default, but 
PowerPC can be operated in a mode in which byte 
ordering is Little-Endian. See Appendix D, “Little- 
Endian Byte Ordering” on page 233. 

Operand length is implicit for each instruction. 

The operand of a single-register Storage Access 
instruction has a “natural” alignment boundary equal 
to the operand length. In other words, the “natural” 
address of an operand is an integral multiple of the 
operand length. A storage operand is said to be 
“aligned” if it is aligned at its natural boundary: other¬ 
wise it is said to be “unaligned.” 

Storage operands for single-register Storage Access 
instructions have the following characteristics. 
(Although not permitted as storage operands, 
quadwords are shown because quadword alignment is 
desirable for certain storage operands.)

Operand       Length      Addr 60:63 if aligned

Byte          8 bits      xxxx
Halfword      2 bytes     xxx0
Word          4 bytes     xx00
Doubleword    8 bytes     x000
Quadword      16 bytes    0000

Note: An “x” in an address bit position indicates
that the bit can be 0 or 1 independent of the state of
other bits in the address.



The concept of alignment is also applied more gener¬ 
ally, to any datum in storage. For example, a 12-byte 
datum in storage is said to be word-aligned if its 
address is an integral multiple of 4. 
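
The alignment rule can be illustrated with a short Python sketch. It is not part of the architecture: the names OPERAND_LENGTH, is_aligned, and low_order_bits are hypothetical and chosen only for this example. The sketch tests whether an address is an integral multiple of the operand length and shows the four low-order address bits that the table above describes.

```python
# Operand lengths in bytes, as listed in the table above.
OPERAND_LENGTH = {"byte": 1, "halfword": 2, "word": 4,
                  "doubleword": 8, "quadword": 16}

def is_aligned(address, length):
    """True if address is an integral multiple of length (in bytes)."""
    return address % length == 0

def low_order_bits(address):
    """Address bits 60:63 (the four low-order bits) as a binary string."""
    return format(address & 0xF, "04b")

# A datum at address 0x1004 is word-aligned but not doubleword-aligned.
assert is_aligned(0x1004, OPERAND_LENGTH["word"])
assert not is_aligned(0x1004, OPERAND_LENGTH["doubleword"])
assert low_order_bits(0x1004) == "0100"   # matches the pattern xx00
```

Because alignment is defined for any datum, the same test applies to the 12-byte example: only the address, not the length of the datum, decides whether it is word-aligned.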

Some instructions require their storage operands to 
have certain alignments. In addition, alignment may 
affect performance. For single-register Storage 
Access instructions the best performance is obtained 
when storage operands are aligned. Additional 
effects of data placement on performance are 
described in Part 2, “PowerPC Virtual Environment 
Architecture” on page 117. 

Instructions are always four bytes long and word- 
aligned. 

1.11.2 Effective Address Calculation 

The 64- or 32-bit address computed by the processor 
when executing a Storage Access or Branch instruc¬ 
tion (or certain other instructions described in Part 2, 
“PowerPC Virtual Environment Architecture” on 
page 117, and Part 3, “PowerPC Operating Environ¬ 
ment Architecture” on page 141), or when fetching 
the next sequential instruction, is called the “effective 
address,” and specifies a byte in storage. For a 
Storage Access instruction, if the sum of the effective 
address and the operand length exceeds the 
maximum effective address, the storage operand is 
considered to wrap around from the maximum effec¬ 
tive address to effective address 0, as described 
below. 

Effective address computations, for both data and 
instruction accesses, use 64{32}-bit unsigned binary 
arithmetic regardless of mode. A carry from bit 0 is 
ignored. In a 64-bit implementation, the 64-bit current 
instruction address and next instruction address are 
not affected by a change from 32-bit mode to 64-bit 
mode, but they are affected by a change from 64-bit 
mode to 32-bit mode (the high-order 32 bits are set to
0).

In 64-bit mode, the entire 64-bit result comprises the
64-bit effective address. The effective address arithmetic
wraps around from the maximum address,
2^64 - 1, to address 0.


In 32-bit mode, the low-order 32 bits of the 64-bit 
result comprise the effective address for the purpose 
of addressing storage. The high-order 32 bits of the 
64-bit effective address are ignored for the purpose of 
accessing data, but are included whenever a 64-bit 
effective address is placed into a GPR by Load with 
Update and Store with Update instructions. The high- 
order 32 bits of the 64-bit effective address are set to 
0 for the purpose of fetching instructions, and when¬ 
ever a 64-bit effective address is placed into the Link 
Register by Branch instructions having LK = 1. The 
high-order 32 bits of the 64-bit effective address are 
set to 0 in Special Purpose Registers when the 
system error handler is invoked. As used to address
storage, the effective address arithmetic appears to
wrap around from the maximum address, 2^32 - 1, to
address 0.

A zero in the RA field indicates the absence of the 
corresponding address component. For the absent 
component, a value of zero is used for the address. 
This is shown in the instruction descriptions as (RA|0). 
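
The wrap-around behavior and the (RA|0) convention can be sketched in Python under the assumption of a 64-bit implementation. The function storage_ea, the list gpr, and the flag mode64 are hypothetical names introduced for this illustration only. The sketch models the address as used to address storage, not the value written back by update-form instructions.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def storage_ea(gpr, ra, component, mode64):
    """Model of (RA|0) + component as used to address storage."""
    base = 0 if ra == 0 else gpr[ra]       # RA = 0 means the base is absent
    result = (base + component) & MASK64   # a carry out of bit 0 is ignored
    # In 32-bit mode only bits 32:63 of the 64-bit result address storage.
    return result if mode64 else result & MASK32

gpr = [0] * 32
gpr[3] = MASK64 - 3                        # 0xFFFF_FFFF_FFFF_FFFC

# 64-bit mode: the sum wraps from 2^64 - 1 to 0.
assert storage_ea(gpr, 3, 8, mode64=True) == 4
# RA = 0: the contents of GPR 0 are not used, whatever they are.
gpr[0] = 0x1234
assert storage_ea(gpr, 0, 8, mode64=True) == 8
```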

In both 64-bit and 32-bit modes, the calculated Effec¬ 
tive Address may be modified in its three low-order 
bits before accessing storage if the PowerPC system 
is operating in Little-Endian mode. See Appendix D, 
“Little-Endian Byte Ordering” on page 233. 

Effective addresses are computed as follows. In the 
descriptions below, it should be understood that “the 
contents of a GPR” refers to the entire 64-bit con¬ 
tents, independent of mode, but that in 32-bit mode, 
only bits 32:63 of the 64-bit result of the computation 
are used to address storage. 

■ With X-form instructions, in computing the effec¬ 
tive address of a data element, the contents of 
the GPR designated by RB is added to the con¬ 
tents of the GPR designated by RA or to zero if 
RA = 0. 

■ With D-form instructions, the 16-bit D field is sign- 
extended to form a 64-bit address component. In 
computing the effective address of a data 
element, this address component is added to the 
contents of the GPR designated by RA or to zero 
if RA = 0. 

■ With DS-form instructions, the 14-bit DS field is
concatenated on the right with 0b00 and sign-extended
to form a 64-bit address component. In
computing the effective address of a data
element, this address component is added to the
contents of the GPR designated by RA or to zero
if RA = 0.

■ With I-form Branch instructions, the 24-bit LI field
is concatenated on the right with 0b00 and sign-extended
to form a 64-bit address component. If
AA = 0, this address component is added to the
address of the branch instruction to form the
effective address of the next instruction. If
AA = 1, this address component is the effective
address of the next instruction.

■ With B-form Branch instructions, the 14-bit BD
field is concatenated on the right with 0b00 and
sign-extended to form a 64-bit address component.
If AA = 0, this address component is added
to the address of the branch instruction to form
the effective address of the next instruction. If
AA = 1, this address component is the effective
address of the next instruction.

■ With XL-form Branch instructions, bits 0:61 of the
Link Register or the Count Register are concatenated
on the right with 0b00 to form the effective
address of the next instruction.

■ With sequential instruction fetching, the value 4 is 
added to the address of the current instruction to 
form the effective address of the next instruction. 
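
The address components listed above can be modeled in a few lines of Python. This is a sketch under the assumption of a 64-bit implementation, and all function names (exts, d_form_component, ds_form_component, branch_target, xl_form_target, next_sequential) are hypothetical. Field values are passed as unsigned integers of the stated width, and the 32-bit mode case clears the high-order 32 bits of an instruction address as described earlier in this section.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def exts(value, bits):
    """Sign-extend a bits-wide field to a 64-bit two's-complement value."""
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64

def d_form_component(d):
    """D-form: the 16-bit D field, sign-extended."""
    return exts(d, 16)

def ds_form_component(ds):
    """DS-form: the 14-bit DS field || 0b00, sign-extended."""
    return exts(ds << 2, 16)

def _fetch_address(nia, mode64):
    # In 32-bit mode the high-order 32 bits are set to 0 for fetching.
    return nia if mode64 else nia & MASK32

def branch_target(cia, field, field_bits, aa, mode64):
    """I-form (LI, field_bits=24) or B-form (BD, field_bits=14) target."""
    component = exts(field << 2, field_bits + 2)
    nia = component if aa else (cia + component) & MASK64
    return _fetch_address(nia, mode64)

def xl_form_target(register, mode64):
    """XL-form: bits 0:61 of the LR or CTR concatenated with 0b00."""
    return _fetch_address(register & MASK64 & ~0b11, mode64)

def next_sequential(cia, mode64):
    """Sequential fetching: the current instruction address plus 4."""
    return _fetch_address((cia + 4) & MASK64, mode64)

# An LI field of all ones encodes a displacement of -4 bytes.
assert branch_target(0x1000, 0xFFFFFF, 24, aa=False, mode64=True) == 0xFFC
# The two low-order bits of the Link Register are not used.
assert xl_form_target(0x1003, mode64=True) == 0x1000
```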


Chapter 2. Branch Processor


2.1 Branch Processor Overview 

This chapter describes the registers and instructions 
that make up the Branch Processor facilities. Section 
2.3, “Branch Processor Registers” on page 17 
describes the registers associated with the Branch 
Processor. Section 2.4, “Branch Processor 
Instructions” on page 19 describes the instructions 
associated with the Branch Processor. 


2.2 Instruction Fetching 

In general, instructions appear to execute sequen¬ 
tially, in the order in which they appear in storage. 
The exceptions to this rule are listed below. 

■ Branch instructions for which the branch is taken 
cause execution to continue at the target address 
generated by the Branch instruction. 

■ Trap and System Call instructions cause the 
appropriate system handler to be invoked. 

■ Exceptions can cause the system error handler to 
be invoked, as described in Section 1.10, 
“Exceptions” on page 14. 

■ The Return From Interrupt instruction, described 
in “Return From Interrupt XL-form” on page 150, 
causes execution to continue at the address con¬ 
tained in a Special Purpose Register. 

In general, each instruction appears to complete
before the next instruction starts. The only exceptions
to this rule arise when the system error handler
is invoked imprecisely, as described in Section 1.10,
“Exceptions” on page 14, or when certain special registers
are altered, as described in Appendix L,
“Synchronization Requirements for Special Registers”
on page 275. None of these special registers can be
altered by an application program.


— Programming Note - 

CAUTION 

Implementations are allowed to prefetch any 
number of instructions before the instructions are 
actually executed. If a program modifies the 
instructions it intends to execute, it should call a 
system library program to ensure that the modifi¬ 
cations have been made visible to the instruction 
fetching mechanism prior to attempting to execute 
the modified instructions. 


2.3 Branch Processor Registers 

2.3.1 Condition Register 

The Condition Register (CR) is a 32-bit register which 
reflects the result of certain operations, and provides 
a mechanism for testing (and branching). 


Figure 18. Condition Register (CR, bits 0:31)

The bits in the Condition Register are grouped into
eight 4-bit fields, named CR Field 0 (CR0), ..., CR Field
7 (CR7), which are set in one of the following ways:

■ Specified fields of the CR can be set by a move
to the CR from a GPR (mtcrf).

■ A specified field of the CR can be set by a move
to the CR from another CR field (mcrf), from the
XER (mcrxr), or from the FPSCR (mcrfs).

■ CR Field 0 can be set as the implicit result of a 
fixed-point operation. 

■ CR Field 1 can be set as the implicit result of a 
floating-point operation. 

■ A specified CR field can be set as the result of 
either a fixed-point or a floating-point Compare 
instruction. 

Instructions are provided to perform logical operations
on individual CR bits, and to test individual CR
bits.

When Rc = 1 in most fixed-point instructions, the first
three bits of CR Field 0 (bits 0:2 of the Condition Register)
are set by an algebraic comparison of the result
(the low-order 32 bits of the result in 32-bit mode) to
zero, and the fourth bit of CR Field 0 (bit 3 of the Condition
Register) is copied from the SO field of the
XER. addic., andi., and andis. set these four bits
implicitly. These bits are interpreted as follows. As
used below, “result” refers to the entire 64-bit value
placed into the target register in 64-bit mode, and to
bits 32:63 of the 64-bit value placed into the target
register in 32-bit mode. If any portion of the result is
undefined, then the value placed into the first three
bits of CR Field 0 is undefined.

Bit Description 

0 Negative (LT) 

The result is negative. 

1 Positive (GT) 

The result is positive. 

2 Zero (EQ) 

The result is zero. 

3 Summary Overflow (SO) 

This is a copy of the final state of XER[SO] at the
completion of the instruction.

- Programming Note - 

CR Field 0 may not reflect the “true” (infinitely 
precise) result if overflow occurs: see Section 
3.3.9, “Fixed-Point Arithmetic Instructions” on 
page 50. 
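
The setting of CR Field 0 can be modeled as follows. This Python sketch assumes a 64-bit implementation and uses hypothetical names (to_signed, cr0_bits, mode64). It returns the four bits in the order LT, GT, EQ, SO, and it compares either the whole 64-bit result or only bits 32:63, depending on the mode.

```python
MASK64 = (1 << 64) - 1

def to_signed(value, bits):
    """Interpret the low-order bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def cr0_bits(result, xer_so, mode64):
    """Return (LT, GT, EQ, SO) for CR Field 0 after an Rc = 1 instruction."""
    width = 64 if mode64 else 32           # 32-bit mode tests bits 32:63 only
    signed = to_signed(result & MASK64, width)
    return (int(signed < 0), int(signed > 0), int(signed == 0), int(xer_so))

# The same 64-bit value is positive in 64-bit mode and negative in 32-bit mode.
assert cr0_bits(0x0000000080000000, 0, mode64=True) == (0, 1, 0, 0)
assert cr0_bits(0x0000000080000000, 0, mode64=False) == (1, 0, 0, 0)
```

The example shows why software that runs in both modes should not assume that CR Field 0 describes the full 64-bit register contents.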


When Rc=1 in all floating-point instructions, CR Field 
1 (bits 4:7 of the Condition Register) is set to the 
Floating-Point exception status, copied from bits 0:3 of 
the Floating-Point Status and Control Register. These 
bits are interpreted as follows. 

Bit Description 

4 Floating-Point Exception (FX)

This is a copy of the final state of FPSCR[FX] at the
completion of the instruction.

5 Floating-Point Enabled Exception (FEX)

This is a copy of the final state of FPSCR[FEX] at
the completion of the instruction.

6 Floating-Point Invalid Operation Exception (VX)

This is a copy of the final state of FPSCR[VX] at the
completion of the instruction.

7 Floating-Point Overflow Exception (OX)

This is a copy of the final state of FPSCR[OX] at
the completion of the instruction.

When a specified CR field is set by a Compare 
instruction, the bits of the specified field are inter¬ 
preted as follows. 


Bit Description 

0 Less Than, Floating-Point Less Than (LT, FL)

For fixed-point Compare instructions, (RA) < SI,
UI, or (RB) (algebraic comparison) or (RA) <u SI,
UI, or (RB) (logical comparison). For floating-point
Compare instructions, (FRA) < (FRB).

1 Greater Than, Floating-Point Greater Than (GT,
FG)

For fixed-point Compare instructions, (RA) > SI,
UI, or (RB) (algebraic comparison) or (RA) >u SI,
UI, or (RB) (logical comparison). For floating-point
Compare instructions, (FRA) > (FRB).

2 Equal, Floating-Point Equal (EQ, FE)

For fixed-point Compare instructions, (RA) = SI,
UI, or (RB). For floating-point Compare
instructions, (FRA) = (FRB).

3 Summary Overflow, Floating-Point Unordered 
(SO, FU) 

For fixed-point Compare instructions, this is a 
copy of the final state of XER[SO] at the completion
of the instruction. For floating-point Compare 
instructions, one or both of (FRA) and (FRB) is a 
NaN. 
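
The difference between algebraic and logical comparison, and the unordered case for floating-point operands, can be sketched in Python. The functions fixed_compare and float_compare are hypothetical, operate on full 64-bit register values, and return the four bits of the target CR field in bit order (0, 1, 2, 3). Python floats stand in for floating-point register contents.

```python
import math

MASK64 = (1 << 64) - 1
SIGN64 = 1 << 63

def fixed_compare(a, b, xer_so, algebraic):
    """Return (LT, GT, EQ, SO) for a fixed-point compare of two 64-bit values."""
    a &= MASK64
    b &= MASK64
    if algebraic:                      # treat both operands as signed
        a = (a ^ SIGN64) - SIGN64
        b = (b ^ SIGN64) - SIGN64
    return (int(a < b), int(a > b), int(a == b), int(xer_so))

def float_compare(fra, frb):
    """Return (FL, FG, FE, FU) for a floating-point compare."""
    if math.isnan(fra) or math.isnan(frb):
        return (0, 0, 0, 1)            # unordered: one or both operands is a NaN
    return (int(fra < frb), int(fra > frb), int(fra == frb), 0)

# 0xFFFF...FFFF is -1 algebraically but the largest value logically.
assert fixed_compare(MASK64, 1, 0, algebraic=True) == (1, 0, 0, 0)
assert fixed_compare(MASK64, 1, 0, algebraic=False) == (0, 1, 0, 0)
```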


2.3.2 Link Register 

The Link Register (LR) is a 64-bit register. It can be 
used to provide the branch target address for the 
Branch Conditional to Link Register instruction, and it 
holds the return address after Branch and Link 
instructions. 


Figure 19. Link Register (LR, bits 0:63)

2.3.3 Count Register 

The Count Register (CTR) is a 64-bit register. It can 
be used to hold a loop count that can be decremented 
during execution of Branch instructions that contain 
an appropriately coded BO field. If the value in the 
Count Register is 0 before being decremented, it is
-1 afterward. The Count Register can also be used
to provide the branch target address for the Branch 
Conditional to Count Register instruction. 


Figure 20. Count Register (CTR, bits 0:63)


2.4 Branch Processor Instructions


2.4.1 Branch Instructions 


The sequence of instruction execution can be changed 
by the Branch instructions. Because all instructions
are on word boundaries, bits 62 and 63 of the gener¬ 
ated branch target address are ignored by the 
processor in performing the branch. 

The Branch instructions compute the effective 
address (EA) of the target in one of the following four 
ways, as described in Section 1.11.2, “Effective 
Address Calculation” on page 15. 

1. Adding a displacement to the address of the 
branch instruction (Branch or Branch Conditional 
with AA = 0). 

2. Specifying an absolute address (Branch or 
Branch Conditional with AA = 1). 

3. Using the address contained in the Link Register 
(Branch Conditional to Link Register). 

4. Using the address contained in the Count Reg¬ 
ister (Branch Conditional to Count Register). 

In all four cases, in 32-bit mode of 64-bit implementa¬ 
tions, the final step in the address computation is 
setting the high-order 32 bits of the target address to
0.

For the first two methods, the target addresses can 
be computed sufficiently ahead of the branch instruc¬ 
tion that instructions can be prefetched along the 
target path. For the third and fourth methods, pre¬ 
fetching instructions along the target path is also pos¬ 
sible provided the Link Register or the Count Register 
is loaded sufficiently ahead of the branch instruction. 

Branching can be conditional or unconditional, and 
the return address can optionally be provided. If the 
return address is to be provided (LK = 1), the effective 
address of the instruction following the branch 
instruction is placed into the Link Register after the 
branch target address has been computed: this is 
done whether or not the branch is taken. 

In Branch Conditional instructions, the BO field speci¬ 
fies the conditions under which the branch is taken. 
The first four bits of the BO field specify how the 
branch is affected by or affects the Condition Register 
and the Count Register. The fifth bit, shown below as 
having the value “y,” may be used by some imple¬ 
mentations as described below. 

The encoding for the BO field is as follows. Here 
M = 32 in 32-bit mode and M = 0 in 64-bit mode. If the 
BO field specifies that the CTR is to be decremented, 
the entire 64-bit CTR is decremented regardless of 
the mode. 


BO      Description

0000y   Decrement the CTR, then branch if the decremented
        CTR[M:63] ≠ 0 and the condition is FALSE.

0001y   Decrement the CTR, then branch if the decremented
        CTR[M:63] = 0 and the condition is FALSE.

001zy   Branch if the condition is FALSE.

0100y   Decrement the CTR, then branch if the decremented
        CTR[M:63] ≠ 0 and the condition is TRUE.

0101y   Decrement the CTR, then branch if the decremented
        CTR[M:63] = 0 and the condition is TRUE.

011zy   Branch if the condition is TRUE.

1z00y   Decrement the CTR, then branch if the decremented
        CTR[M:63] ≠ 0.

1z01y   Decrement the CTR, then branch if the decremented
        CTR[M:63] = 0.

1z1zz   Branch always.

Above, “z” denotes a bit that must be zero: if it is not 
zero the instruction form is invalid. 
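
The BO encoding can be modeled with a short Python sketch. It is an illustration only, assuming a 64-bit implementation; the names bo_bits, bo_is_valid, and branch_conditional are hypothetical. BO is passed as a 5-bit integer whose leftmost bit is BO[0], cr_bit is the value (0 or 1) of the Condition Register bit selected by BI, and the function returns whether the branch is taken together with the new CTR value.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def bo_bits(bo):
    """Split the 5-bit BO field into [BO0, BO1, BO2, BO3, BO4]."""
    return [(bo >> (4 - i)) & 1 for i in range(5)]

def bo_is_valid(bo):
    """False if any bit shown as "z" in the table is coded as 1."""
    b = bo_bits(bo)
    if b[0] and b[2]:                  # 1z1zz: branch always
        return b[1] == 0 and b[3] == 0 and b[4] == 0
    if b[0]:                           # 1z00y, 1z01y
        return b[1] == 0
    if b[2]:                           # 001zy, 011zy
        return b[3] == 0
    return True                        # 0000y, 0001y, 0100y, 0101y

def branch_conditional(bo, cr_bit, ctr, mode64):
    """Return (taken, new_ctr) according to the BO table."""
    b = bo_bits(bo)
    if not b[2]:
        ctr = (ctr - 1) & MASK64       # the entire 64-bit CTR is decremented
    tested = ctr if mode64 else ctr & MASK32       # CTR[M:63]
    ctr_ok = bool(b[2]) or ((tested != 0) != bool(b[3]))
    cond_ok = bool(b[0]) or (cr_bit == b[1])
    return (ctr_ok and cond_ok, ctr)

# BO = 0b10000: decrement the CTR and branch if it is still non-zero.
assert branch_conditional(0b10000, 0, 2, mode64=True) == (True, 1)
assert branch_conditional(0b10000, 0, 1, mode64=True) == (False, 0)
```

In 32-bit mode only the low-order 32 bits of the decremented CTR are tested, so a CTR value of 0x1_0000_0001 stops a count loop after one decrement even though the full register is non-zero.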

The “y” bit provides a hint about whether a condi¬ 
tional branch is likely to be taken, and may be used 
by some implementations to improve performance. 

The “branch always” encoding of the BO field does 
not have a “y” bit. 

For Branch Conditional instructions that have a “y” 
bit, using y = 0 indicates that the following behavior is 
likely. 

■ If the instruction is bc[l][a] with a negative value
in the displacement field, the branch is taken.

■ In all other cases (bc[l][a] with a non-negative
value in the displacement field, bclr[l], or
bcctr[l]), the branch falls through (is not taken).

Using y = 1 reverses the preceding indications. 

The displacement field is used as described above 
even if the target is an absolute address. 
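
The meaning of the “y” bit can be summarized in a small Python sketch. The function predicted_taken and its parameters are hypothetical names used only here. The argument base is "bc", "bclr", or "bcctr" (the LK and AA settings do not affect the hint), and displacement is the signed value of the displacement field, which is consulted only for bc.

```python
def predicted_taken(base, displacement, y):
    """Return True if the hint says the branch is likely to be taken."""
    # y = 0: only bc with a negative displacement is predicted taken.
    default = base == "bc" and displacement < 0
    # y = 1 reverses the indication.
    return default != bool(y)

# A backward bc (typical loop-closing branch) is predicted taken by default.
assert predicted_taken("bc", -8, y=0) is True
# bclr is predicted not taken unless y = 1.
assert predicted_taken("bclr", 0, y=1) is True
```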

- Programming Note -

The default value for the “y” bit should be 0: the
value 1 should be used only if software has determined
that the prediction corresponding to y = 1 is
more likely to be correct than the prediction corresponding
to y = 0.


Extended mnemonics for branches 

Many extended mnemonics are provided so that 
Branch Conditional instructions can be coded with the 
condition as part of the instruction mnemonic rather 
than as a numeric operand. Some of these are shown 
as examples with the Branch instructions. See 
Appendix C, “Assembler Extended Mnemonics” on 
page 223 for additional extended mnemonics. 


— Programming Note - 

In some implementations the processor may keep 
a stack of the Link Register values most recently 
set by Branch and Link instructions, with the pos¬ 
sible exception of the form shown below for 
obtaining the address of the next instruction. To 
benefit from this stack, the following programming 
conventions should be used. 

Let A, B, and Glue be programs. 

■ Obtaining the address of the next instruction: 
Use the following form of Branch and Link. 

bcl 20,31,$+4 

■ Loop counts: 

Keep them in the Count Register, and use 
one of the Branch Conditional instructions to 
decrement the count and to control branching 
(e.g., branching back to the start of a loop if 
the decremented counter value is non-zero). 

■ Computed goto's, case statements, etc.: 

Use the Count Register to hold the address to 
branch to, and use the bcctr instruction 
(LK = 0) to branch to the selected address. 

■ Direct subroutine linkage: 

Here A calls B and B returns to A. The two 
branches should be as follows. 

— A calls B: use a Branch instruction that 
sets the Link Register (LK = 1). 

— B returns to A: use the bclr instruction 
(LK = 0) (the return address is in, or can 
be restored to, the Link Register). 

■ Indirect subroutine linkage: 

Here A calls Glue, Glue calls B, and B returns 
to A rather than to Glue. (Such a calling 
sequence is common in linkage code used 
when the subroutine that the programmer 
wants to call, here B, is in a different module 
from the caller: the Binder inserts “glue” 
code to mediate the branch.) The three 
branches should be as follows. 

— A calls Glue: use a Branch instruction 
that sets the Link Register (LK = 1). 

— Glue calls B: place the address of B in 
the Count Register, and use the bcctr 
instruction (LK = 0). 

— B returns to A: use the bclr instruction 
(LK = 0) (the return address is in, or can 
be restored to, the Link Register). 

Branch I-form

b      target_addr    (AA = 0 LK = 0)
ba     target_addr    (AA = 1 LK = 0)
bl     target_addr    (AA = 0 LK = 1)
bla    target_addr    (AA = 1 LK = 1)

if AA then NIA <-iea EXTS(LI || 0b00)
else       NIA <-iea CIA + EXTS(LI || 0b00)
if LK then LR <-iea CIA + 4

target_addr specifies the branch target address.

If AA = 0 then the branch target address is the sum of
LI || 0b00 sign-extended and the address of this
instruction, with the high-order 32 bits of the branch
target address set to 0 in 32-bit mode of 64-bit
implementations.

If AA = 1 then the branch target address is the value
LI || 0b00 sign-extended, with the high-order 32 bits of
the branch target address set to 0 in 32-bit mode of
64-bit implementations.

If LK = 1 then the effective address of the instruction
following the Branch instruction is placed into the Link
Register.

Special Registers Altered:

    LR    (if LK = 1)


Branch Conditional B-form

bc      BO,BI,target_addr    (AA = 0 LK = 0)
bca     BO,BI,target_addr    (AA = 1 LK = 0)
bcl     BO,BI,target_addr    (AA = 0 LK = 1)
bcla    BO,BI,target_addr    (AA = 1 LK = 1)

if (64-bit implementation) & (64-bit mode)
    then M <- 0
    else M <- 32
if ¬BO[2] then CTR <- CTR - 1
ctr_ok <- BO[2] | ((CTR[M:63] ≠ 0) ⊕ BO[3])
cond_ok <- BO[0] | (CR[BI] ≡ BO[1])
if ctr_ok & cond_ok then
    if AA then NIA <-iea EXTS(BD || 0b00)
    else       NIA <-iea CIA + EXTS(BD || 0b00)
if LK then LR <-iea CIA + 4

The BI field specifies the bit in the Condition Register
to be used as the condition of the branch. The BO
field is used as described above. target_addr specifies
the branch target address.

If AA = 0 then the branch target address is the sum of
BD || 0b00 sign-extended and the address of this
instruction, with the high-order 32 bits of the branch
target address set to 0 in 32-bit mode of 64-bit
implementations.

If AA = 1 then the branch target address is the value
BD || 0b00 sign-extended, with the high-order 32 bits
of the branch target address set to 0 in 32-bit mode of
64-bit implementations.

If LK = 1 then the effective address of the instruction
following the Branch instruction is placed into the Link
Register.

Special Registers Altered:

    CTR    (if BO[2] = 0)
    LR     (if LK = 1)

Extended Mnemonics: 

Examples of extended mnemonics for Branch Condi¬ 
tional: 

Extended: Equivalent to: 

blt    target          bc    12,0,target
bne    cr2,target      bc    4,10,target
bdnz   target          bc    16,0,target
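
The three equivalences above follow from two rules: the BO value selects the kind of test, and BI is the Condition Register bit number, that is, 4 times the CR field number plus the position of the bit within the field. The Python sketch below reproduces only the examples shown here. The names CR_BIT, CONDITIONS, and expand are hypothetical, and the table is deliberately not a complete list of extended mnemonics.

```python
# Position of each bit within a 4-bit CR field.
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3}

# (BO, tested bit) for the condition-based examples shown above.
CONDITIONS = {
    "blt": (12, "lt"),    # BO = 0b01100: branch if the LT bit is 1
    "bne": (4, "eq"),     # BO = 0b00100: branch if the EQ bit is 0
}

def expand(mnemonic, cr_field=0):
    """Return the (BO, BI) operands of the equivalent bc instruction."""
    if mnemonic == "bdnz":
        return (16, 0)                 # BO = 0b10000: decrement CTR, test it
    bo, bit = CONDITIONS[mnemonic]
    return (bo, 4 * cr_field + CR_BIT[bit])

assert expand("blt") == (12, 0)
assert expand("bne", cr_field=2) == (4, 10)
assert expand("bdnz") == (16, 0)
```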

Branch Conditional to Link Register XL-form

bclr     BO,BI    (LK = 0)
bclrl    BO,BI    (LK = 1)

[Power mnemonics: bcr, bcrl]

    19  |  BO  |  BI  |  ///  |  16  |  LK
    0      6      11     16      21     31

if (64-bit implementation) & (64-bit mode)
    then M <- 0
    else M <- 32
if ¬BO[2] then CTR <- CTR - 1
ctr_ok <- BO[2] | ((CTR[M:63] ≠ 0) ⊕ BO[3])
cond_ok <- BO[0] | (CR[BI] ≡ BO[1])
if ctr_ok & cond_ok then NIA <-iea LR[0:61] || 0b00
if LK then LR <-iea CIA + 4

The BI field specifies the bit in the Condition Register
to be used as the condition of the branch. The BO
field is used as described above, and the branch
target address is LR[0:61] || 0b00, with the high-order 32
bits of the branch target address set to 0 in 32-bit
mode of 64-bit implementations.

If LK = 1 then the effective address of the instruction
following the Branch instruction is placed into the Link
Register.

Special Registers Altered:

    CTR    (if BO[2] = 0)
    LR     (if LK = 1)

Extended Mnemonics:

Examples of extended mnemonics for Branch Conditional
To Link Register:

Extended:        Equivalent to:

bltlr            bclr    12,0
bnelr   cr2      bclr    4,10
bdnzlr           bclr    16,0


Branch Conditional to Count Register XL-form

bcctr     BO,BI    (LK = 0)
bcctrl    BO,BI    (LK = 1)

[Power mnemonics: bcc, bccl]

    19  |  BO  |  BI  |  ///  |  528  |  LK
    0      6      11     16      21      31

cond_ok <- BO[0] | (CR[BI] ≡ BO[1])
if cond_ok then NIA <-iea CTR[0:61] || 0b00
if LK then LR <-iea CIA + 4

The BI field specifies the bit in the Condition Register
to be used as the condition of the branch. The BO
field is used as described above, and the branch
target address is CTR[0:61] || 0b00, with the high-order
32 bits of the branch target address set to 0 in 32-bit
mode of 64-bit implementations.

If LK = 1 then the effective address of the instruction
following the Branch instruction is placed into the Link
Register.

If the “decrement and test CTR” option is specified
(BO[2] = 0), the instruction form is invalid.

Special Registers Altered:

    LR    (if LK = 1)

Extended Mnemonics:

Examples of extended mnemonics for Branch Conditional
To Count Register:


Extended: 

bltctr 

bnectr cr2 


Equivalent to. 

bcctr 12,0 
bcctr 4,10 
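
The BO and BI tests shared by the Branch Conditional instructions above can be modelled in a few lines. The following Python sketch is illustrative only and is not part of the architecture definition: the function name `branch_taken`, its argument layout, and the use of plain integers for CR and CTR are assumptions made for this example. It follows the ctr_ok and cond_ok equations as given in the instruction descriptions, with bit 0 taken as the most significant bit.

```python
def branch_taken(bo, bi, cr, ctr, mode64=True):
    """Model the BO/BI test of a Branch Conditional (hypothetical helper).

    bo  : 5-bit BO field, bits numbered 0..4 from the left
    bi  : CR bit number 0..31 (bit 0 is the most significant CR bit)
    cr  : 32-bit Condition Register value
    ctr : 64-bit Count Register value
    Returns (taken, new_ctr).
    """
    def bo_bit(i):
        return (bo >> (4 - i)) & 1

    # M = 0 in 64-bit mode, M = 32 otherwise: test CTR bits M:63 only.
    tested = (1 << (64 if mode64 else 32)) - 1

    if not bo_bit(2):                       # if not BO_2 then CTR <- CTR - 1
        ctr = (ctr - 1) & ((1 << 64) - 1)

    ctr_ok = bool(bo_bit(2)) or (((ctr & tested) != 0) != bool(bo_bit(3)))
    cr_bit = (cr >> (31 - bi)) & 1
    cond_ok = bool(bo_bit(0)) or (cr_bit == bo_bit(1))
    return (ctr_ok and cond_ok), ctr
```

For instance, `branch_taken(16, 0, 0, 2)` models bdnz with CTR = 2: the count is decremented to 1 and the branch is taken, while `branch_taken(12, 0, 0x80000000, 5)` models blt with the LT bit of CR field 0 set.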
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2.4.2 System Call Instruction 


This instruction provides the means by which a 
program can call upon the system to perform a 
service. 


System Call SC-form 

sc 

[Power mnemonic: svca] 


17 /// /// /// 1 / 
0 6 11 16 30 31 


- Compatibility Note —-- 

For a discussion of Power compatibility with 
respect to instruction bits 16:29, please refer to 
Appendix G, “Incompatibilities with the Power 
Architecture” on page 257. For compatibility with 
future versions of this architecture, these bits 
should be coded as zero. 


This instruction calls the system to perform a service. 
A complete description of this instruction can be 
found in “System Call SC-form” on page 150. 

When control is returned to the program that exe¬ 
cuted the System Call, the content of the registers will 
depend on the register conventions used by the 
program providing the system service. 

This instruction is context synchronizing, see “System 
Call SC-form” on page 150. 

Special Registers Altered: 

Dependent on the system service 
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2.4.3 Condition Register Logical Instructions 


Extended mnemonics for Condition 
Register logical operations 

A set of extended mnemonics is provided that allows 
additional Condition Register logical operations, 
beyond those provided by the basic Condition Register 
Logical instructions, to be coded easily. Some of 
these are shown as examples with the CR Logical 
instructions. See Appendix C, “Assembler Extended 
Mnemonics” on page 223 for additional extended 
mnemonics. 


Condition Register AND XL-form 

crand BT,BA,BB 


19 BT BA BB 257 / 
0 6 11 16 21 31 

CR_BT <- CR_BA & CR_BB 

The bit in the Condition Register specified by BA is 
ANDed with the bit in the Condition Register specified 
by BB and the result is placed into the bit in the Con¬ 
dition Register specified by BT. 

Special Registers Altered: 

CR 


Condition Register XOR XL-form 

crxor BT,BA,BB 


19 BT BA BB 193 / 
0 6 11 16 21 31 

CR_BT <- CR_BA ⊕ CR_BB 

The bit in the Condition Register specified by BA is 
XORed with the bit in the Condition Register specified 
by BB and the result is placed into the bit in the Con¬ 
dition Register specified by BT. 

Special Registers Altered: 

CR 

Extended Mnemonics: 

Example of extended mnemonics for Condition Reg¬ 
ister XOR: 

Extended: Equivalent to: 

crclr Bx crxor Bx,Bx,Bx 


Condition Register OR XL-form 

cror BT,BA,BB 


19 BT BA BB 449 / 
0 6 11 16 21 31 

CR_BT <- CR_BA | CR_BB 




The bit in the Condition Register specified by BA is 
ORed with the bit in the Condition Register specified 
by BB and the result is placed into the bit in the Con¬ 
dition Register specified by BT. 

Special Registers Altered: 
CR 




Extended Mnemonics: 

Example of extended mnemonics for Condition Register OR: 

Extended:           Equivalent to: 

crmove Bx,By        cror Bx,By,By 


Condition Register NAND XL-form 

crnand BT,BA,BB 

19 BT BA BB 225 / 
0 6 11 16 21 31 

CR_BT <- ¬(CR_BA & CR_BB) 

The bit in the Condition Register specified by BA is 
ANDed with the bit in the Condition Register specified 
by BB and the complemented result is placed into the 
bit in the Condition Register specified by BT. 

Special Registers Altered: 

CR 
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Condition Register NOR XL-form 

crnor BT,BA,BB 

19 BT BA BB 33 / 
0 6 11 16 21 31 

CR_BT <- ¬(CR_BA | CR_BB) 

The bit in the Condition Register specified by BA is 
ORed with the bit in the Condition Register specified 
by BB and the complemented result is placed into the 
bit in the Condition Register specified by BT. 

Special Registers Altered: 

CR 

Extended Mnemonics: 

Example of extended mnemonics for Condition Reg¬ 
ister NOR: 

Extended: Equivalent to: 

crnot Bx,By         crnor Bx,By,By 


Condition Register Equivalent XL-form 

creqv BT,BA,BB 

19 BT BA BB 289 / 
0 6 11 16 21 31 

CR_BT <- CR_BA ≡ CR_BB 

The bit in the Condition Register specified by BA is 
XORed with the bit in the Condition Register specified 
by BB and the complemented result is placed into the 
bit in the Condition Register specified by BT. 

Special Registers Altered: 

CR 

Extended Mnemonics: 

Example of extended mnemonics for Condition Reg¬ 
ister Equivalent: 

Extended: Equivalent to: 

crset Bx            creqv Bx,Bx,Bx 


Condition Register AND With Complement XL-form 

crandc BT,BA,BB 

19 BT BA BB 129 / 
0 6 11 16 21 31 

CR_BT <- CR_BA & ¬CR_BB 

The bit in the Condition Register specified by BA is 
ANDed with the complement of the bit in the Condi¬ 
tion Register specified by BB and the result is placed 
into the bit in the Condition Register specified by BT. 

Special Registers Altered: 

CR 


Condition Register OR With Complement XL-form 

crorc BT,BA,BB 

19 BT BA BB 417 / 
0 6 11 16 21 31 

CR_BT <- CR_BA | ¬CR_BB 

The bit in the Condition Register specified by BA is 
ORed with the complement of the bit in the Condition 
Register specified by BB and the result is placed into 
the bit in the Condition Register specified by BT. 

Special Registers Altered: 

CR 
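
The eight Condition Register Logical instructions differ only in the Boolean function applied to the two source bits, so they can be modelled with one table. The sketch below is an illustration under stated assumptions, not an implementation from the architecture: the names `CR_OPS`, `cr_bit`, and `cr_logical` are hypothetical, and the CR is held as a 32-bit Python integer with bit 0 as the most significant bit.

```python
# Boolean function for each CR Logical instruction (a and b are single bits).
CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 ^ (a & b),
    "crnor":  lambda a, b: 1 ^ (a | b),
    "creqv":  lambda a, b: 1 ^ (a ^ b),
    "crandc": lambda a, b: a & (1 ^ b),
    "crorc":  lambda a, b: a | (1 ^ b),
}


def cr_bit(cr, i):
    """Return CR bit i, where bit 0 is the most significant of 32 bits."""
    return (cr >> (31 - i)) & 1


def cr_logical(op, cr, bt, ba, bb):
    """Return the new CR value after 'op BT,BA,BB' (hypothetical model)."""
    result = CR_OPS[op](cr_bit(cr, ba), cr_bit(cr, bb))
    shift = 31 - bt
    return (cr & ~(1 << shift) & 0xFFFFFFFF) | (result << shift)
```

The extended mnemonics fall out directly: `cr_logical("crxor", cr, x, x, x)` clears bit x (crclr), `cr_logical("creqv", cr, x, x, x)` sets it (crset), and `cr_logical("cror", cr, x, y, y)` copies bit y to bit x (crmove).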
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2.4.4 Condition Register Field 
Instruction 


Move Condition Register Field XL-form 

mcrf BF,BFA 


19 BF // BFA // /// 0 / 
0 6 9 11 14 16 21 31 

CR_4×BF:4×BF+3 <- CR_4×BFA:4×BFA+3 

The contents of Condition Register field BFA are 
copied into Condition Register field BF. 

Special Registers Altered: 

CR 
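
As a small illustration of the field copy, the following Python sketch models Move Condition Register Field on a CR held as a 32-bit integer. The function name `mcrf` mirrors the mnemonic, but the signature is an assumption made for this example; field n is taken to occupy CR bits 4×n through 4×n+3, with bit 0 most significant.

```python
def mcrf(cr, bf, bfa):
    """Copy 4-bit CR field BFA into field BF and return the new CR value."""
    src = (cr >> (28 - 4 * bfa)) & 0xF      # extract CR_4*BFA:4*BFA+3
    shift = 28 - 4 * bf                     # position of CR_4*BF:4*BF+3
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (src << shift)
```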
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Chapter 3. Fixed-Point Processor 


3.1 Fixed-Point Processor 
Overview 

This chapter describes the registers and instructions 
that make up the Fixed-Point Processor facility. 
Section 3.2, “Fixed-Point Processor Registers” on 
page 27 describes the registers associated with the 
Fixed-Point Processor. Section 3.3, “Fixed-Point 
Processor Instructions” on page 29 describes the 
instructions associated with the Fixed-Point Processor. 

3.2 Fixed-Point Processor 
Registers 

3.2.1 General Purpose Registers 

All manipulation of information is done in registers 
internal to the Fixed-Point Processor. The principal 
storage internal to the Fixed-Point Processor is a set 
of 32 general purpose registers (GPRs). See 
Figure 21. 



0 63 


Figure 21. General Purpose Registers 
Each GPR is a 64-bit register. 

3.2.2 Fixed-Point Exception Register 

The Fixed-Point Exception Register (XER) is a 32-bit 
register. 


XER 

0 31 

Figure 22. Fixed-Point Exception Register 

The bit definitions for the Fixed-Point Exception Register 
are as shown below. Here M = 0 in 64-bit mode 
and M = 32 in 32-bit mode. 

The bits are set based on the operation of an instruc¬ 
tion considered as a whole, not on intermediate 
results (e.g., the Subtract From Carrying instruction, 
the result of which is specified as the sum of three 
values, sets bits in the Fixed-Point Exception Register 
based on the entire operation, not on an intermediate 
sum). 

Bit(s) Description 

0 Summary Overflow (SO) 

The Summary Overflow bit is set to one 
whenever an instruction (except mtspr) sets 
the Overflow bit to indicate overflow. Once 
set, the SO bit remains set until it is cleared 
by an mtspr instruction (specifying the XER) 
or an mcrxr instruction. It is not altered by 
Compare instructions, nor by other 
instructions (except mtspr to the XER, and 
mcrxr) that cannot overflow. Executing an 
mtspr instruction to the XER, supplying the 
values zero for SO and one for OV, causes 
SO to be set to zero and OV to be set to one. 

1 Overflow (OV) 

The Overflow bit is set to indicate that an 
overflow has occurred during execution of an 
instruction. XO-form Add and Subtract 
instructions having OE=1 set it to one if the 
carry out of bit M is not equal to the carry 
out of bit M+1, and set it to zero otherwise. 
The OV bit is not altered by Compare 
instructions, nor by other instructions (except 
mtspr to the XER, and mcrxr) that cannot 
overflow. 
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2 Carry (CA) 

In general, the Carry bit is set to indicate that 
a carry out of bit M has occurred during execution 
of an instruction. Add Carrying, Subtract 
From Carrying, Add Extended, and 
Subtract From Extended instructions set it to 
one if there is a carry out of bit M, and set it 
to zero otherwise. However, Shift Right Algebraic 
instructions set the CA bit to indicate 
whether any 1 bits have been shifted out of 
a negative quantity. The CA bit is not altered 
by Compare instructions, nor by other 
instructions (except Shift Right Algebraic, 
mtspr to the XER, and mcrxr) that cannot 
carry. 


3:24 Reserved 

25:31 This field specifies the number of bytes to be 
transferred by a Load String Indexed or Store 
String Indexed instruction. 
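
The OV and CA rules above can be checked with a short model. The Python sketch below is a rough illustration, assuming a hypothetical helper `add_with_flags` that adds two operands plus an optional carry-in over a 32-bit or 64-bit width. Because bit M is the most significant bit of the result, "carry out of bit M" is the carry out of the whole sum and "carry out of bit M+1" is the carry into the most significant bit.

```python
def add_with_flags(a, b, carry_in=0, bits=32):
    """Return (result, CA, OV) for a + b + carry_in over 'bits' bits.

    CA models the carry out of bit M; OV is 1 when the carry out of
    bit M differs from the carry out of bit M+1. (Hypothetical helper;
    which instructions actually write CA or OV is described in the text.)
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    total = a + b + carry_in
    carry_out_m = total >> bits                      # carry out of bit M
    low = mask >> 1                                  # all bits below bit M
    carry_out_m1 = ((a & low) + (b & low) + carry_in) >> (bits - 1)
    return total & mask, carry_out_m, int(carry_out_m != carry_out_m1)
```

A Subtract From operation, described above as a sum of three values, can be modelled as `add_with_flags(~ra, rb, 1)`, that is, the complement of (RA) plus (RB) plus 1.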

- Compatibility Note - 

For a discussion of Power compatibility with 
respect to XER bits 16:23, please refer to 
Appendix G, “Incompatibilities with the Power 
Architecture” on page 257. For compatibility with 
future versions of this architecture, these bits 
should be set to zero. 
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3.3 Fixed-Point Processor Instructions 


This section describes the instructions executed by the Fixed-Point processor. 


3.3.1 Storage Access Instructions 

The Storage Access instructions compute the effective 
address (EA) of the storage to be accessed as 
described in Section 1.11.2, “Effective Address 
Calculation” on page 15. 

The order of bytes accessed by halfword, word, and 
doubleword loads and stores is Big-Endian, unless 
Little-Endian storage ordering is selected as 
described in Appendix D, “Little-Endian Byte 
Ordering” on page 235. 

- Programming Note - 

The “la” extended mnemonic permits computing 
an Effective Address as a Load or Store instruc¬ 
tion would, but loads the address itself into a GPR 
rather than loading the value that is in storage at 
that address. This extended mnemonic is 
described in “Load Address” on page 234. 


3.3.1.1 Storage Access Exceptions 

Storage accesses will cause the system error handler 
to be invoked if the program is not allowed to modify 
the target storage (Store only), or if the program 
attempts to access storage that is unavailable. 


3.3.2 Fixed-Point Load Instructions 

The byte, halfword, word, or doubleword in storage 
addressed by EA is loaded into register RT. 

Byte order of PowerPC is Big-Endian by default; see 
Appendix D, “Little-Endian Byte Ordering” on 
page 235 for PowerPC systems operated with Little- 
Endian byte ordering. 

Many of the Load instructions have an “update” form, 
in which register RA is updated with the effective 
address. For these forms, if RA ≠ 0 and RA ≠ RT, the 
effective address is placed into register RA and the 
storage element (byte, halfword, word, or doubleword) 
addressed by EA is loaded into RT. 

- Programming Note - 

In some implementations, the Load Algebraic and 
Load with Update instructions may have greater 
latency than other types of Load instructions. 
Moreover, Load with Update instructions may take 
longer to execute in some implementations than 
the corresponding pair of a non-update Load 
instruction and an Add instruction. 
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Load Byte and Zero D-form 


lbz RT,D(RA) 

34 RT RA D 
0 6 11 16 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
RT <- ⁵⁶0 || MEM(EA, 1) 

Let the effective address (EA) be the sum (RA|0)+D. 
The byte in storage addressed by EA is loaded into 
RT_56:63. RT_0:55 are set to 0. 

Special Registers Altered: 

None 


Load Byte and Zero Indexed X-form 


lbzx RT,RA,RB 

31 RT RA RB 87 / 
0 6 11 16 21 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
RT <- ⁵⁶0 || MEM(EA, 1) 

Let the effective address (EA) be the sum 
(RA|0) + (RB). The byte in storage addressed by EA is 
loaded into RT_56:63. RT_0:55 are set to 0. 

Special Registers Altered: 

None 


Load Byte and Zero with Update D-form 

lbzu RT,D(RA) 

35 RT RA D 
0 6 11 16 31 

EA <- (RA) + EXTS(D) 
RT <- ⁵⁶0 || MEM(EA, 1) 
RA <- EA 

Let the effective address (EA) be the sum (RA)+D. 
The byte in storage addressed by EA is loaded into 
RT_56:63. RT_0:55 are set to 0. 

EA is placed into register RA. 

If RA = 0 or RA = RT, the instruction form is invalid. 

Special Registers Altered: 

None 


Load Byte and Zero with Update Indexed X-form 

lbzux RT,RA,RB 

31 RT RA RB 119 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
RT <- ⁵⁶0 || MEM(EA, 1) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
The byte in storage addressed by EA is loaded into 
RT_56:63. RT_0:55 are set to 0. 

EA is placed into register RA. 

If RA = 0 or RA = RT, the instruction form is invalid. 

Special Registers Altered: 

None 
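
The difference between the (RA|0) base of the plain forms and the (RA) base of the update forms can be made concrete with a small model. This Python sketch is illustrative only: the register file is a list of 32 integers, storage is a dictionary from address to byte value, and the helper names `exts16`, `lbz`, and `lbzu` are assumptions of this example. Raising an exception for an invalid form is a modelling choice, not behaviour defined by the text above.

```python
MASK64 = (1 << 64) - 1


def exts16(d):
    """Sign-extend the 16-bit D field to a Python integer."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lbz(gpr, mem, rt, d, ra):
    """Load Byte and Zero: EA = (RA|0) + EXTS(D); returns EA."""
    base = 0 if ra == 0 else gpr[ra]        # RA = 0 means a base of zero
    ea = (base + exts16(d)) & MASK64
    gpr[rt] = mem[ea]                       # byte into RT_56:63, rest zero
    return ea


def lbzu(gpr, mem, rt, d, ra):
    """Load Byte and Zero with Update: EA = (RA) + EXTS(D), then RA <- EA."""
    if ra == 0 or ra == rt:
        raise ValueError("invalid instruction form: RA = 0 or RA = RT")
    ea = (gpr[ra] + exts16(d)) & MASK64
    gpr[rt] = mem[ea]
    gpr[ra] = ea
    return ea
```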
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Load Halfword and Zero D-form 


lhz RT,D(RA) 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
RT <- ⁴⁸0 || MEM(EA, 2) 

Let the effective address (EA) be the sum (RA|0)+D. 
The halfword in storage addressed by EA is loaded 
into RT_48:63. RT_0:47 are set to 0. 

Special Registers Altered: 

None 


Load Halfword and Zero Indexed X-form 

lhzx RT,RA,RB 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
RT <- ⁴⁸0 || MEM(EA, 2) 

Let the effective address (EA) be the sum 
(RA|0) + (RB). The halfword in storage addressed by 
EA is loaded into RT_48:63. RT_0:47 are set to 0. 

Special Registers Altered: 

None 


Load Halfword and Zero with Update D-form 

lhzu RT,D(RA) 

41 RT RA D 
0 6 11 16 31 

EA <- (RA) + EXTS(D) 
RT <- ⁴⁸0 || MEM(EA, 2) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + D. 
The halfword in storage addressed by EA is loaded 
into RT_48:63. RT_0:47 are set to 0. 

EA is placed into register RA. 

If RA = 0 or RA = RT, the instruction form is invalid. 

Special Registers Altered: 

None 


Load Halfword and Zero with Update Indexed X-form 

lhzux RT,RA,RB 

31 RT RA RB 311 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
RT <- ⁴⁸0 || MEM(EA, 2) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
The halfword in storage addressed by EA is loaded 
into RT_48:63. RT_0:47 are set to 0. 

EA is placed into register RA. 

If RA = 0 or RA=RT, the instruction form is invalid. 

Special Registers Altered: 

None 
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Load Halfword Algebraic D-form 


lha RT,D(RA) 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
RT <- EXTS(MEM(EA, 2)) 

Let the effective address (EA) be the sum (RA|0) + D. 
The halfword in storage addressed by EA is loaded 
into RT_48:63. RT_0:47 are filled with a copy of bit 0 of 
the loaded halfword. 

Special Registers Altered: 

None 


Load Halfword Algebraic Indexed 
X-form 

lhax RT,RA,RB 


31 RT RA RB 343 / 



if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
RT <- EXTS(MEM(EA, 2)) 

Let the effective address (EA) be the sum 
(RA|0)+(RB). The halfword in storage addressed by 
EA is loaded into RT_48:63. RT_0:47 are filled with a copy 
of bit 0 of the loaded halfword. 

Special Registers Altered: 

None 


Load Halfword Algebraic with Update 
D-form 


lhau RT,D(RA) 

43 RT RA D 
0 6 11 16 31 

EA <- (RA) + EXTS(D) 
RT <- EXTS(MEM(EA, 2)) 
RA <- EA 

Let the effective address (EA) be the sum (RA)+D. 
The halfword in storage addressed by EA is loaded 
into RT_48:63. RT_0:47 are filled with a copy of bit 0 of 
the loaded halfword. 

EA is placed into register RA. 

If RA = 0 or RA=RT, the instruction form is invalid. 

Special Registers Altered: 

None 


Load Halfword Algebraic with Update 
Indexed X-form 

lhaux RT,RA,RB 

31 RT RA RB 375 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
RT <- EXTS(MEM(EA, 2)) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
The halfword in storage addressed by EA is loaded 
into RT_48:63. RT_0:47 are filled with a copy of bit 0 of 
the loaded halfword. 

EA is placed into register RA. 

If RA = 0 or RA = RT, the instruction form is invalid. 

Special Registers Altered: 

None 
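
The only difference between the "and Zero" and "Algebraic" halfword loads is how the upper 48 bits of RT are filled. A minimal Python sketch of that difference follows; the helper names `exts` and `load_halfword` are hypothetical, storage is modelled as a dictionary of byte values, and Big-Endian byte order is assumed.

```python
def exts(value, width, out_bits=64):
    """Sign-extend a 'width'-bit value to 'out_bits' bits (as an unsigned int)."""
    value &= (1 << width) - 1
    if value >> (width - 1):                # bit 0 of the loaded element is 1
        value -= 1 << width
    return value & ((1 << out_bits) - 1)


def load_halfword(mem, ea, algebraic):
    """Return the 64-bit RT value for a halfword load at EA."""
    half = (mem[ea] << 8) | mem[ea + 1]     # Big-Endian: byte at EA is high-order
    return exts(half, 16) if algebraic else half
```

With the bytes 0xFF, 0xFE at EA, the zero-extending form yields 0x000000000000FFFE and the algebraic form yields 0xFFFFFFFFFFFFFFFE.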
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Load Word and Zero D-form 


lwz RT,D(RA) 

[Power mnemonic: l] 

32 RT RA D 
0 6 11 16 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
RT <- ³²0 || MEM(EA, 4) 

Let the effective address (EA) be the sum (RA|0) + D. 
The word in storage addressed by EA is loaded into 
RT_32:63. RT_0:31 are set to 0. 

Special Registers Altered: 

None 


Load Word and Zero Indexed X-form 

lwzx RT,RA,RB 

[Power mnemonic: lx] 

31 RT RA RB 23 / 
0 6 11 16 21 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
RT <- ³²0 || MEM(EA, 4) 

Let the effective address (EA) be the sum 
(RA|0)+(RB). The word in storage addressed by EA 
is loaded into RT_32:63. RT_0:31 are set to 0. 

Special Registers Altered: 

None 


Load Word and Zero with Update D-form 

lwzu RT,D(RA) 

[Power mnemonic: lu] 

33 RT RA D 
0 6 11 16 31 

EA <- (RA) + EXTS(D) 
RT <- ³²0 || MEM(EA, 4) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + D. 
The word in storage addressed by EA is loaded into 
RT_32:63. RT_0:31 are set to 0. 

EA is placed into register RA. 

If RA = 0 or RA = RT, the instruction form is invalid. 

Special Registers Altered: 

None 


Load Word and Zero with Update Indexed X-form 

lwzux RT,RA,RB 

[Power mnemonic: lux] 

31 RT RA RB 55 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
RT <- ³²0 || MEM(EA, 4) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
The word in storage addressed by EA is loaded into 
RT_32:63. RT_0:31 are set to 0. 

EA is placed into register RA. 

If RA = 0 or RA=RT, the instruction form is invalid. 

Special Registers Altered: 

None 
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Load Word Algebraic DS-form 


lwa RT,DS(RA) 

58 RT RA DS 2 
0 6 11 16 30 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(DS || 0b00) 
RT <- EXTS(MEM(EA, 4)) 

Let the effective address (EA) be the sum 
(RA|0) + (DS || 0b00). The word in storage addressed by 
EA is loaded into RT_32:63. RT_0:31 are filled with a copy 
of bit 0 of the loaded word. 

This instruction is defined only for 64-bit implementations. 
Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Load Word Algebraic Indexed X-form 

lwax RT,RA,RB 

31 RT RA RB 341 / 
0 6 11 16 21 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
RT <- EXTS(MEM(EA, 4)) 

Let the effective address (EA) be the sum 
(RA|0) + (RB). The word in storage addressed by EA 
is loaded into RT_32:63. RT_0:31 are filled with a copy of 
bit 0 of the loaded word. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Load Word Algebraic with Update 
Indexed X-form 


lwaux RT,RA,RB 

31 RT RA RB 373 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
RT <- EXTS(MEM(EA, 4)) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
The word in storage addressed by EA is loaded into 
RT_32:63. RT_0:31 are filled with a copy of bit 0 of the 
loaded word. 

EA is placed into register RA. 

If RA = 0 or RA = RT, the instruction form is invalid. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 
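
In the DS-form, the 14-bit DS field is concatenated with 0b00 before sign extension, and instruction bits 30:31 distinguish the instructions that share primary opcode 58. The Python sketch below illustrates this decoding for the three DS-form loads whose encodings appear in this section (ld, ldu, lwa). The function name `decode_ds_form` and the table `DS_XO` are assumptions of this example, and an encoding with bits 30:31 equal to 3 is simply rejected here.

```python
# Bits 30:31 of a DS-form load with primary opcode 58, per the formats shown.
DS_XO = {0: "ld", 1: "ldu", 2: "lwa"}


def decode_ds_form(instr):
    """Decode a 32-bit DS-form load into (mnemonic, RT, RA, displacement)."""
    if instr >> 26 != 58:
        raise ValueError("not primary opcode 58")
    xo = instr & 0x3
    if xo not in DS_XO:
        raise ValueError("extended opcode not modelled in this sketch")
    rt = (instr >> 21) & 0x1F
    ra = (instr >> 16) & 0x1F
    disp = instr & 0xFFFC                   # DS || 0b00 (low two bits forced to 0)
    if disp & 0x8000:                       # EXTS of the 16-bit quantity
        disp -= 0x10000
    return DS_XO[xo], rt, ra, disp
```

Because the low two displacement bits are always zero, DS-form displacements are multiples of 4 in the range -32768 to 32764.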
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Load Doubleword DS-form 


ld RT,DS(RA) 

58 RT RA DS 0 
0 6 11 16 30 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(DS || 0b00) 
RT <- MEM(EA, 8) 

Let the effective address (EA) be the sum 
(RA|0) + (DS || 0b00). The doubleword in storage 
addressed by EA is loaded into RT. 

This instruction is defined only for 64-bit implementations. 
Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Load Doubleword Indexed X-form 

ldx RT,RA,RB 

31 RT RA RB 21 / 
0 6 11 16 21 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
RT <- MEM(EA, 8) 

Let the effective address (EA) be the sum 
(RA|0) + (RB). The doubleword in storage addressed 
by EA is loaded into RT. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Load Doubleword with Update DS-form 

ldu RT,DS(RA) 

58 RT RA DS 1 
0 6 11 16 30 31 

EA <- (RA) + EXTS(DS || 0b00) 
RT <- MEM(EA, 8) 
RA <- EA 

Let the effective address (EA) be the sum 
(RA) + (DS||0b00). The doubleword in storage 
addressed by EA is loaded into RT. 

EA is placed into register RA. 

If RA = 0 or RA=RT, the instruction form is invalid. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Load Doubleword with Update Indexed 
X-form 


ldux RT,RA,RB 

31 RT RA RB 53 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
RT <- MEM(EA, 8) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
The doubleword in storage addressed by EA is loaded 
into RT. 

EA is placed into register RA. 

If RA = 0 or RA= RT, the instruction form is invalid. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 
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3.3.3 Fixed-Point Store Instructions 


The contents of register RS are stored into the byte, 
halfword, word, or doubleword in storage addressed 
by EA. 

Byte order of PowerPC is Big-Endian by default; see 
Appendix D, “Little-Endian Byte Ordering” on 
page 235 for PowerPC systems operated with Little- 
Endian byte ordering. 


Many of the Store instructions have an “update” form, 
in which register RA is updated with the effective 
address. For these forms, the following rules apply. 

■ If RA ≠ 0, the effective address is placed into 
register RA. 

■ If RS = RA, the contents of register RS are copied 
to the target storage element and then EA is 
placed into RA (RS). 
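
The ordering in the second rule (store first, then update) matters when RS and RA name the same register. The Python sketch below illustrates it using Store Word with Update as a representative; the names `exts16` and `stwu`, the list-based register file, and the dictionary-based storage are all assumptions of this example, and raising an exception for RA = 0 is a modelling choice.

```python
MASK64 = (1 << 64) - 1


def exts16(d):
    """Sign-extend the 16-bit D field to a Python integer."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def stwu(gpr, mem, rs, d, ra):
    """Store Word with Update: store (RS)_32:63 at EA, then RA <- EA."""
    if ra == 0:
        raise ValueError("invalid instruction form: RA = 0")
    ea = (gpr[ra] + exts16(d)) & MASK64
    word = gpr[rs] & 0xFFFFFFFF             # read RS before RA is overwritten
    for i in range(4):                      # Big-Endian byte order
        mem[(ea + i) & MASK64] = (word >> (24 - 8 * i)) & 0xFF
    gpr[ra] = ea                            # update happens after the store
    return ea
```

With RS = RA, as in `stwu(gpr, mem, 1, -16, 1)`, the old value of register 1 is what reaches storage, and register 1 then holds the new, lower address.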


Store Byte D-form 

stb RS,D(RA) 

38 RS RA D 
0 6 11 16 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
MEM(EA, 1) <- (RS)_56:63 

Let the effective address (EA) be the sum (RA|0)+D. 
(RS)_56:63 is stored into the byte in storage addressed 
by EA. 

Special Registers Altered: 

None 


Store Byte Indexed X-form 

stbx RS,RA,RB 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
MEM(EA, 1) <- (RS)_56:63 

Let the effective address (EA) be the sum 
(RA|0) + (RB). (RS)_56:63 is stored into the byte in 
storage addressed by EA. 

Special Registers Altered: 

None 


Store Byte with Update D-form 

stbu RS,D(RA) 

39 RS RA D 
0 6 11 16 31 

EA <- (RA) + EXTS(D) 
MEM(EA, 1) <- (RS)_56:63 
RA <- EA 

Let the effective address (EA) be the sum (RA)+D. 
(RS)_56:63 is stored into the byte in storage addressed 
by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

Special Registers Altered: 

None 


Store Byte with Update Indexed X-form 

stbux RS,RA,RB 

31 RS RA RB 247 / 
0 6 11 16 21 31 

EA <- (RA) + (RB) 
MEM(EA, 1) <- (RS)_56:63 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
(RS)_56:63 is stored into the byte in storage addressed 
by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

Special Registers Altered: 

None 
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Store Halfword D-form 


sth RS,D(RA) 

44 RS RA D 
0 6 11 16 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
MEM(EA, 2) <- (RS)_48:63 

Let the effective address (EA) be the sum (RA|0) + D. 
(RS)_48:63 is stored into the halfword in storage 
addressed by EA. 

Special Registers Altered: 

None 


Store Halfword Indexed X-form 

sthx RS,RA,RB 

31 RS RA RB 407 / 
0 6 11 16 21 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
MEM(EA, 2) <- (RS)_48:63 

Let the effective address (EA) be the sum 
(RA|0) + (RB). (RS)_48:63 is stored into the halfword in 
storage addressed by EA. 

Special Registers Altered: 

None 


Store Halfword with Update D-form 


sthu RS,D(RA) 


45 RS RA D 
0 6 11 16 31 

EA <- (RA) + EXTS(D) 
MEM(EA, 2) <- (RS)_48:63 
RA <- EA 

Let the effective address (EA) be the sum (RA)+D. 
(RS)_48:63 is stored into the halfword in storage 
addressed by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

Special Registers Altered: 

None 


Store Halfword with Update Indexed 
X-form 


sthux RS,RA,RB 

EA <- (RA) + (RB) 
MEM(EA, 2) <- (RS)_48:63 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
(RS)_48:63 is stored into the halfword in storage 
addressed by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

Special Registers Altered: 

None 
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Store Word D-form 

stw RS,D(RA) 

[Power mnemonic: st] 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(D) 
MEM(EA, 4) <- (RS)_32:63 

Let the effective address (EA) be the sum (RA|0)+D. 
(RS)_32:63 is stored into the word in storage addressed 
by EA. 

Special Registers Altered: 

None 


Store Word Indexed X-form 

stwx RS,RA,RB 

[Power mnemonic: stx] 

31 RS RA RB 151 / 
0 6 11 16 21 31 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
MEM(EA, 4) <- (RS)_32:63 

Let the effective address (EA) be the sum 
(RA|0)+(RB). (RS)_32:63 is stored into the word in 
storage addressed by EA. 

Special Registers Altered: 

None 


Store Word with Update D-form 

stwu RS,D(RA) 

[Power mnemonic: stu] 

EA <- (RA) + EXTS(D) 
MEM(EA, 4) <- (RS)_32:63 
RA <- EA 

Let the effective address (EA) be the sum (RA)+D. 
(RS)_32:63 is stored into the word in storage addressed 
by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

Special Registers Altered: 

None 


Store Word with Update Indexed X-form 

stwux RS,RA,RB 

[Power mnemonic: stux] 

EA <- (RA) + (RB) 
MEM(EA, 4) <- (RS)_32:63 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
(RS)_32:63 is stored into the word in storage addressed 
by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

Special Registers Altered: 

None 
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Store Doubleword DS-form 


std RS,DS(RA) 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + EXTS(DS || 0b00) 
MEM(EA, 8) <- (RS) 

Let the effective address (EA) be the sum 
(RA|0) + (DS || 0b00). (RS) is stored into the 
doubleword in storage addressed by EA. 

This instruction is defined only for 64-bit implementations. 
Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Store Doubleword Indexed X-form 

stdx RS,RA,RB 

if RA = 0 then b <- 0 
else b <- (RA) 
EA <- b + (RB) 
MEM(EA, 8) <- (RS) 

Let the effective address (EA) be the sum 
(RA|0)+(RB). (RS) is stored into the doubleword in 
storage addressed by EA. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Store Doubleword with Update DS-form 


stdu RS,DS(RA) 

EA <- (RA) + EXTS(DS || 0b00) 
MEM(EA, 8) <- (RS) 
RA <- EA 

Let the effective address (EA) be the sum 
(RA) + (DS || 0b00). (RS) is stored into the doubleword 
in storage addressed by EA. 

EA is placed into register RA. 

If RA = 0, the instruction form is invalid. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 


Store Doubleword with Update Indexed 
X-form 


stdux RS,RA,RB 

EA <- (RA) + (RB) 
MEM(EA, 8) <- (RS) 
RA <- EA 

Let the effective address (EA) be the sum (RA) + (RB). 
(RS) is stored into the doubleword in storage 
addressed by EA. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

None 
3.3.4 Fixed-Point Load and Store with Byte Reversal Instructions


When used in a PowerPC system operating with Big-Endian
byte order (the default), these instructions have the
effect of loading and storing data in Little-Endian
order. Likewise, when used in a PowerPC system operating
with Little-Endian byte order, these instructions have
the effect of loading and storing data in Big-Endian
order. See Appendix D, “Little-Endian Byte Ordering” on
page 235 for a discussion of byte order.

- Programming Note - 

In some implementations, the Load Byte-Reverse 
instructions may have greater latency than other 
Load instructions. 


Load Halfword Byte-Reverse Indexed
X-form


lhbrx RT,RA,RB


31 RT RA RB 790 /
0 6 11 16 21 31


if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

RT <- ^{48}0 || MEM(EA+1, 1) || MEM(EA, 1)

Let the effective address (EA) be the sum
(RA|0) + (RB). Bits 0:7 of the halfword in storage
addressed by EA are loaded into RT_{56:63}. Bits 8:15 of
the halfword in storage addressed by EA are loaded
into RT_{48:55}. RT_{0:47} are set to 0.

Special Registers Altered: 

None 


Load Word Byte-Reverse Indexed
X-form


lwbrx RT,RA,RB
[Power mnemonic: lbrx]

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

RT <- ^{32}0 || MEM(EA+3, 1) || MEM(EA+2, 1)
             || MEM(EA+1, 1) || MEM(EA, 1)

Let the effective address (EA) be the sum
(RA|0) + (RB). Bits 0:7 of the word in storage
addressed by EA are loaded into RT_{56:63}. Bits 8:15 of
the word in storage addressed by EA are loaded into
RT_{48:55}. Bits 16:23 of the word in storage addressed
by EA are loaded into RT_{40:47}. Bits 24:31 of the word
in storage addressed by EA are loaded into RT_{32:39}.
RT_{0:31} are set to 0.

Special Registers Altered: 

None 
Store Halfword Byte-Reverse Indexed
X-form


sthbrx RS,RA,RB


0 6 11 16 21 31


if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

MEM(EA, 2) <- (RS)_{56:63} || (RS)_{48:55}

Let the effective address (EA) be the sum
(RA|0) + (RB). (RS)_{56:63} are stored into bits 0:7 of the
halfword in storage addressed by EA. (RS)_{48:55} are
stored into bits 8:15 of the halfword in storage
addressed by EA.

Special Registers Altered: 

None 


Store Word Byte-Reverse Indexed
X-form

stwbrx RS,RA,RB
[Power mnemonic: stbrx]


31 RS RA RB 662 /


if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

MEM(EA, 4) <- (RS)_{56:63} || (RS)_{48:55} || (RS)_{40:47} || (RS)_{32:39}

Let the effective address (EA) be the sum
(RA|0) + (RB). (RS)_{56:63} are stored into bits 0:7 of the
word in storage addressed by EA. (RS)_{48:55} are stored
into bits 8:15 of the word in storage addressed by EA.
(RS)_{40:47} are stored into bits 16:23 of the word in
storage addressed by EA. (RS)_{32:39} are stored into
bits 24:31 of the word in storage addressed by EA.

Special Registers Altered: 

None 
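
The byte-reversal behavior on a Big-Endian system can be sketched in Python. The following is a minimal model, assuming storage is a bytearray indexed by effective address and register values are Python integers; the function names simply mirror the mnemonics and are not an existing library.

```python
def lhbrx(mem, ea):
    """RT <- 48 zero bits || MEM(EA+1, 1) || MEM(EA, 1)."""
    return int.from_bytes(mem[ea:ea + 2], "little")

def lwbrx(mem, ea):
    """RT <- 32 zero bits || MEM(EA+3) || MEM(EA+2) || MEM(EA+1) || MEM(EA)."""
    return int.from_bytes(mem[ea:ea + 4], "little")

def sthbrx(mem, ea, rs):
    """Store the low-order halfword of RS with its two bytes swapped."""
    mem[ea:ea + 2] = (rs & 0xFFFF).to_bytes(2, "little")

def stwbrx(mem, ea, rs):
    """Store the low-order word of RS with its four bytes reversed."""
    mem[ea:ea + 4] = (rs & 0xFFFF_FFFF).to_bytes(4, "little")

mem = bytearray(b"\x11\x22\x33\x44")
word = lwbrx(mem, 0)      # byte at EA ends up in the low-order byte of RT
half = lhbrx(mem, 0)
stwbrx(mem, 0, 0xAABBCCDD)
```

Reading the bytes as a little-endian integer is exactly the reversal the descriptions above spell out bit range by bit range.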
3.3.5 Fixed-Point Load and Store Multiple Instructions


The Load/Store Multiple instructions have preferred
forms: see Section 1.9.1, “Preferred Instruction
Forms” on page 13. In the preferred forms, storage
alignment satisfies the following rule.

■ The combination of the EA and RT (RS) is such
that the low-order byte of GPR 31 is loaded
(stored) from (into) the last byte of an aligned
quadword in storage.

On PowerPC systems operating with Little-Endian byte
order, execution of a Load Multiple or Store Multiple
instruction causes the system alignment error handler
to be invoked. See Appendix D, “Little-Endian Byte
Ordering” on page 235.

- Compatibility Note -

For a discussion of Power compatibility with
respect to the alignment of the EA for the Load
Multiple Word and Store Multiple Word
instructions, please refer to Appendix G,
“Incompatibilities with the Power Architecture” on
page 257. For compatibility with future versions
of this architecture, these EAs should be
word-aligned.


Load Multiple Word D-form

lmw RT,D(RA)

[Power mnemonic: lm]

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + EXTS(D)
r <- RT

do while r ≤ 31
   GPR(r) <- ^{32}0 || MEM(EA, 4)
   r <- r + 1
   EA <- EA + 4

Let n = (32 - RT). Let the effective address (EA) be
the sum (RA|0) + D.

n consecutive words starting at EA are loaded into
the low-order 32 bits of GPRs RT through 31. The
high-order 32 bits of these GPRs are set to zero.

EA must be a multiple of 4. If it is not, the system
alignment error handler may be invoked or the results
may be boundedly undefined.

If RA is in the range of registers to be loaded or
RT = RA = 0, the instruction form is invalid.

Special Registers Altered:

None


Store Multiple Word D-form

stmw RS,D(RA)

[Power mnemonic: stm]

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + EXTS(D)
r <- RS

do while r ≤ 31
   MEM(EA, 4) <- GPR(r)_{32:63}
   r <- r + 1
   EA <- EA + 4

Let n = (32 - RS). Let the effective address (EA) be
the sum (RA|0) + D.

n consecutive words starting at EA are stored from
the low-order 32 bits of GPRs RS through 31.

EA must be a multiple of 4. If it is not, the system
alignment error handler may be invoked or the results
may be boundedly undefined.

Special Registers Altered: 

None 
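
The two loops can be modeled in a few lines of Python. This is a sketch only, assuming a Big-Endian bytearray for storage and a list of 32 integers for the GPRs; the sign extension of D is left to the caller. Raising an exception for a misaligned EA is a choice made for this sketch: the architecture permits either the alignment error handler or boundedly undefined results.

```python
MASK32 = 0xFFFF_FFFF

def _base(gpr, ra, d):
    ea = (0 if ra == 0 else gpr[ra]) + d      # (RA|0) + D, D already sign-extended
    if ea % 4:
        raise ValueError("EA must be a multiple of 4 (sketch choice)")
    return ea

def lmw(gpr, mem, rt, ra, d):
    """Load words into the low-order 32 bits of GPRs RT through 31."""
    if ra >= rt:                               # RA in the loaded range, or RT = RA = 0
        raise ValueError("invalid instruction form")
    ea = _base(gpr, ra, d)
    for r in range(rt, 32):                    # do while r <= 31
        gpr[r] = int.from_bytes(mem[ea:ea + 4], "big")   # high-order 32 bits are zero
        ea += 4

def stmw(gpr, mem, rs, ra, d):
    """Store the low-order 32 bits of GPRs RS through 31."""
    ea = _base(gpr, ra, d)
    for r in range(rs, 32):
        mem[ea:ea + 4] = (gpr[r] & MASK32).to_bytes(4, "big")
        ea += 4

gpr = [0] * 32
mem = bytearray(64)
gpr[1] = 16
gpr[29], gpr[30], gpr[31] = 0x1_0000_00AA, 0xBB, 0xCC
stmw(gpr, mem, 29, 1, 8)           # n = 32 - 29 = 3 words at EA = 24
gpr[29] = gpr[30] = gpr[31] = 0
lmw(gpr, mem, 29, 1, 8)
```

Note that GPR 29 comes back as 0xAA: only the low-order 32 bits were stored, and the load clears the high-order 32 bits.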
3.3.6 Fixed-Point Move Assist Instructions


The Move Assist instructions allow movement of data 
from storage to registers or from registers to storage 
without concern for alignment. These instructions can 
be used for a short move between arbitrary storage 
locations or to initiate a long move between unaligned 
storage fields. 

Load/Store String Indexed instructions of zero length 
have no effect, except that Load String Indexed 
instructions of zero length may set register RT to an 
undefined value. 

The Load/Store String instructions have preferred
forms: see Section 1.9.1, “Preferred Instruction
Forms” on page 13. In the preferred forms, register
usage satisfies the following rules.

■ RS = 5 

■ RT = 5 

■ last register loaded/stored ≤ 12

On PowerPC systems operating with Little-Endian byte 
order, execution of a Load/Store String instruction 
causes the system alignment error handler to be 
invoked. See Appendix D, “Little-Endian Byte 
Ordering” on page 235. 
Load String Word Immediate X-form


lswi RT,RA,NB

[Power mnemonic: lsi]

if RA = 0 then EA <- 0
else EA <- (RA)

if NB = 0 then n <- 32
else n <- NB

r <- RT - 1
i <- 32

do while n > 0
   if i = 32 then
      r <- r + 1 (mod 32)
      GPR(r) <- 0
   GPR(r)_{i:i+7} <- MEM(EA, 1)
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be (RA|0). Let n = NB
if NB ≠ 0, n = 32 if NB = 0: n is the number of bytes to
load. Let nr = CEIL(n ÷ 4): nr is the number of
registers to receive data.

n consecutive bytes starting at EA are loaded into
GPRs RT through RT+nr-1. Data is loaded into the
low-order four bytes of each GPR; the high-order four
bytes are set to 0.

Bytes are loaded left to right in each register. The
sequence of registers wraps around to GPR 0 if
required. If the low-order four bytes of register
RT+nr-1 are only partially filled, the unfilled
low-order byte(s) of that register are set to 0.

If RA is in the range of registers to be loaded or
RT = RA = 0, the instruction form is invalid.

Special Registers Altered: 

None 
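
The string-load loop translates almost line for line into Python. The sketch below assumes a bytearray for storage and a list of 32 integers for the GPRs, and it omits the invalid-form check for brevity; lswi here is a hypothetical helper named after the mnemonic.

```python
def lswi(gpr, mem, rt, ra, nb):
    """Sketch of lswi RT,RA,NB on a Big-Endian system."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb
    r = rt - 1
    i = 32                              # bit index of the next byte within the GPR
    while n > 0:
        if i == 32:
            r = (r + 1) % 32            # wraps around to GPR 0 if required
            gpr[r] = 0
        gpr[r] |= mem[ea] << (56 - i)   # place the byte into bits i:i+7
        i += 8
        if i == 64:
            i = 32
        ea += 1
        n -= 1

mem = bytearray(range(16))
gpr = [0] * 32
gpr[1] = 1                              # base address in GPR 1
lswi(gpr, mem, 30, 1, 7)                # 7 bytes, so nr = CEIL(7 / 4) = 2 registers
```

With bytes 0x01 through 0x07 starting at the effective address, GPR 30 is filled completely and GPR 31 receives three bytes, left-justified within its low-order word, with the remaining byte set to 0.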


Load String Word Indexed X-form


lswx RT,RA,RB

[Power mnemonic: lsx]

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)
n <- XER_{25:31}
r <- RT - 1
i <- 32

RT <- undefined
do while n > 0
   if i = 32 then
      r <- r + 1 (mod 32)
      GPR(r) <- 0
   GPR(r)_{i:i+7} <- MEM(EA, 1)
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be the sum
(RA|0) + (RB). Let n = XER_{25:31}: n is the number of
bytes to load. Let nr = CEIL(n ÷ 4): nr is the number
of registers to receive data.

If n > 0, n consecutive bytes starting at EA are loaded
into GPRs RT through RT+nr-1. Data is loaded into
the low-order four bytes of each GPR; the high-order
four bytes are set to 0.

Bytes are loaded left to right in each register. The
sequence of registers wraps around to GPR 0 if
required. If the low-order four bytes of register
RT+nr-1 are only partially filled, the unfilled
low-order byte(s) of that register are set to 0.

If n = 0, the content of register RT is undefined.

If RA or RB is in the range of registers to be loaded
or RT = RA = 0, the instruction form is invalid.

Special Registers Altered: 

None 
Store String Word Immediate X-form


stswi RS,RA,NB

[Power mnemonic: stsi]

if RA = 0 then EA <- 0
else EA <- (RA)

if NB = 0 then n <- 32
else n <- NB

r <- RS - 1
i <- 32

do while n > 0
   if i = 32 then r <- r + 1 (mod 32)
   MEM(EA, 1) <- GPR(r)_{i:i+7}
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be (RA|0). Let n = NB
if NB ≠ 0, n = 32 if NB = 0: n is the number of bytes to
store. Let nr = CEIL(n ÷ 4): nr is the number of
registers to supply data.

n consecutive bytes starting at EA are stored from
GPRs RS through RS+nr-1. Data is stored from the
low-order four bytes of each GPR.

Bytes are stored left to right from each register. The
sequence of registers wraps around to GPR 0 if
required.

Special Registers Altered:

None


Store String Word Indexed X-form


stswx RS,RA,RB
[Power mnemonic: stsx]

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)
n <- XER_{25:31}
r <- RS - 1
i <- 32

do while n > 0
   if i = 32 then r <- r + 1 (mod 32)
   MEM(EA, 1) <- GPR(r)_{i:i+7}
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be the sum
(RA|0) + (RB). Let n = XER_{25:31}: n is the number of
bytes to store. Let nr = CEIL(n ÷ 4): nr is the number
of registers to supply data.

n consecutive bytes starting at EA are stored from
GPRs RS through RS+nr-1. Data is stored from the
low-order four bytes of each GPR.

Bytes are stored left to right from each register. The
sequence of registers wraps around to GPR 0 if
required.

Special Registers Altered: 

None 
3.3.7 Storage Synchronization Instructions


The Storage Synchronization instructions can be used 
to control the order in which storage operations are 
completed with respect to asynchronous events, and 
the order in which storage operations are seen by 
other processors and by other mechanisms that 
access storage. Additional information about these 
instructions, and about related aspects of storage 
management, can be found in Part 2, “PowerPC 
Virtual Environment Architecture” on page 117, and 
Part 3, “PowerPC Operating Environment 
Architecture” on page 141. 


- Programming Note -

Because the Storage Synchronization instructions
have implementation dependencies (e.g., the
granularity at which reservations are managed),
they must be used with care. The operating
system should provide system library programs
that use these instructions to implement the
high-level synchronization functions (Test and Set,
Compare and Swap, etc.) needed by application
programs. Application programs should use these
library programs, rather than use the Storage
Synchronization instructions directly.


Load Word And Reserve Indexed
X-form

lwarx RT,RA,RB

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

RESERVE <- 1

RESERVE_ADDR <- func(EA)

RT <- ^{32}0 || MEM(EA, 4)

Let the effective address (EA) be the sum
(RA|0) + (RB). The word in storage addressed by EA
is loaded into RT_{32:63}. RT_{0:31} are set to 0.

This instruction creates a reservation for use by a
Store Word Conditional instruction. An address
computed from the EA is associated with the reservation,
and replaces any address previously associated with
the reservation: the manner in which the address to
be associated with the reservation is computed from
the EA is described in Part 2, “PowerPC Virtual
Environment Architecture” on page 117.

EA must be a multiple of 4. If it is not, the system
alignment error handler may be invoked or the results
may be boundedly undefined.

Special Registers Altered:

None


Load Doubleword And Reserve Indexed
X-form


ldarx RT,RA,RB

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

RESERVE <- 1

RESERVE_ADDR <- func(EA)

RT <- MEM(EA, 8)

Let the effective address (EA) be the sum
(RA|0) + (RB). The doubleword in storage addressed
by EA is loaded into RT.

This instruction creates a reservation for use by a
Store Doubleword Conditional instruction. An
address computed from the EA is associated with the
reservation, and replaces any address previously
associated with the reservation: the manner in which
the address to be associated with the reservation is
computed from the EA is described in Part 2,
“PowerPC Virtual Environment Architecture” on
page 117.

EA must be a multiple of 8. If it is not, the system
alignment error handler may be invoked or the results
may be boundedly undefined.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause the system
illegal instruction error handler to be invoked.

Special Registers Altered: 

None 
Store Word Conditional Indexed X-form


stwcx. RS,RA,RB


31 RS RA RB 150 1
0 6 11 16 21 31


if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)
if RESERVE then
   MEM(EA, 4) <- (RS)_{32:63}
   RESERVE <- 0
   CR0 <- 0b00 || 0b1 || XER_{SO}
else
   CR0 <- 0b00 || 0b0 || XER_{SO}

Let the effective address (EA) be the sum
(RA|0) + (RB).

If a reservation exists, (RS)_{32:63} is stored into the
word in storage addressed by EA and the reservation
is cleared.

If a reservation does not exist, the instruction
completes without altering storage.

CR Field 0 is set to reflect whether the store
operation was performed (i.e., whether a reservation
existed when the stwcx. instruction commenced
execution), as follows.

CR0_{LT GT EQ SO} = 0b00 || store_performed || XER_{SO}

EA must be a multiple of 4. If it is not, the system
alignment error handler may be invoked or the results
may be boundedly undefined.

Special Registers Altered:

CR0


Store Doubleword Conditional Indexed
X-form


stdcx. RS,RA,RB

if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)
if RESERVE then
   MEM(EA, 8) <- (RS)
   RESERVE <- 0
   CR0 <- 0b00 || 0b1 || XER_{SO}
else
   CR0 <- 0b00 || 0b0 || XER_{SO}

Let the effective address (EA) be the sum
(RA|0) + (RB).

If a reservation exists, (RS) is stored into the
doubleword in storage addressed by EA and the
reservation is cleared.

If a reservation does not exist, the instruction
completes without altering storage.

CR Field 0 is set to reflect whether the store
operation was performed (i.e., whether a reservation
existed when the stdcx. instruction commenced
execution), as follows.

CR0_{LT GT EQ SO} = 0b00 || store_performed || XER_{SO}

EA must be a multiple of 8. If it is not, the system
alignment error handler may be invoked or the results
may be boundedly undefined.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause the system
illegal instruction error handler to be invoked.

Special Registers Altered:

CR0


- Programming Note -

The granularity with which reservations are
managed is implementation-dependent. Therefore
the storage to be accessed by the Load And
Reserve and Store Conditional instructions should
be allocated by a system library program.
Additional information can be found in Part 2,
“PowerPC Virtual Environment Architecture” on
page 117.
- Programming Note -

When correctly used, the Load And Reserve and
Store Conditional instructions can provide an
atomic update function for a single aligned word
(Load Word And Reserve and Store Word
Conditional) or doubleword (Load Doubleword And
Reserve and Store Doubleword Conditional) of
storage.

One of the requirements for correct use is that
Load Word And Reserve be paired with Store
Word Conditional, and Load Doubleword And
Reserve with Store Doubleword Conditional, with
the same effective address used for both
instructions of the pair. Examples of correct uses
of these instructions, to emulate primitives such
as “Fetch and Add,” “Test and Set,” and
“Compare and Swap,” can be found in Appendix
E.1, “Synchronization” on page 243.

At most one reservation exists on any given
processor: there are not separate reservations for
words and for doublewords.

The conditionality of the Store Conditional
instruction's store is based only on whether a
reservation exists, not on a match between the
address associated with the reservation and the
address computed from the EA of the Store
Conditional instruction.

A reservation is cleared if any of the following
events occurs.

■ The processor holding the reservation
executes another Load And Reserve instruction;
this clears the first reservation and
establishes a new one.

■ The processor holding the reservation
executes a Store Conditional instruction to any
address.

■ Another processor executes any Store
instruction to the address associated with the
reservation.

■ Any mechanism, other than the processor
holding the reservation, stores to the address
associated with the reservation.

See Part 2, “PowerPC Virtual Environment
Architecture” on page 117, for additional
information.
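
The reservation rules listed in the note above can be sketched as a toy Python model. Everything in it is an illustrative assumption: storage is a dict keyed by address, the reservation granule is a single address (the real granularity is implementation-dependent), and the names Processor, store_by_other, and fetch_and_add are hypothetical. The fetch_and_add loop shows the general retry pattern only; it is not the Appendix E.1 code.

```python
class Processor:
    def __init__(self, mem):
        self.mem = mem               # shared storage: address -> word
        self.reserve = False         # at most one reservation per processor
        self.reserve_addr = None

    def lwarx(self, ea):
        self.reserve = True          # replaces any earlier reservation
        self.reserve_addr = ea       # stands in for func(EA)
        return self.mem[ea]

    def stwcx(self, ea, value):
        """Return the store_performed bit that would be placed in CR0."""
        if not self.reserve:         # only existence is tested, not the address
            return False
        self.mem[ea] = value & 0xFFFF_FFFF
        self.reserve = False
        return True

def store_by_other(mem, processors, ea, value):
    """A store by another processor or mechanism clears matching reservations."""
    mem[ea] = value
    for p in processors:
        if p.reserve and p.reserve_addr == ea:
            p.reserve = False

def fetch_and_add(cpu, ea, delta, between=None):
    while True:
        old = cpu.lwarx(ea)
        if between is not None:      # optional hook to simulate interference once
            between()
            between = None
        if cpu.stwcx(ea, old + delta):
            return old

mem = {0x100: 5}
cpu = Processor(mem)
first = fetch_and_add(cpu, 0x100, 3)
second = fetch_and_add(cpu, 0x100, 1,
                       between=lambda: store_by_other(mem, [cpu], 0x100, 20))
```

In the second call the intervening store clears the reservation, the first conditional store fails, and the loop retries with the freshly loaded value.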


Synchronize X-form


sync

[Power mnemonic: dcs]

The sync instruction provides an ordering function for
the effects of all instructions executed by a given
processor. Executing a sync instruction ensures that
all instructions previously initiated by the given
processor appear to have completed before the sync
instruction completes, and that no subsequent
instructions are initiated by the given processor until
after the sync instruction completes. When the sync
instruction completes, all storage accesses initiated
by the given processor prior to the sync will have
been performed with respect to all other mechanisms
that access storage. (See Part 2, “PowerPC Virtual
Environment Architecture” on page 117, for a more
complete description. See also the section entitled
“Table Update Synchronization Requirements” in
Part 3, “PowerPC Operating Environment
Architecture” on page 141, for an exception involving
TLB invalidates.)

This instruction is execution synchronizing (see
Part 3, “PowerPC Operating Environment
Architecture” on page 141).

Special Registers Altered: 

None 


- Programming Note -

The sync instruction can be used to ensure that
the results of all stores into a data structure,
performed in a “critical section” of a program, are
seen by other processors before the data
structure is seen as unlocked.

The functions performed by the sync instruction
will normally take a significant amount of time to
complete, so indiscriminate use of this instruction
may adversely affect performance. In addition,
the time required to execute sync may vary from
one execution to another.

The Enforce In-order Execution of I/O (eieio)
instruction, described in Part 2, “PowerPC Virtual
Environment Architecture” on page 117, may be
more appropriate than sync for cases in which the
only requirement is to control the order in which
storage references are seen by I/O devices.
3.3.8 Other Fixed-Point Instructions


The remainder of the fixed-point instructions use the
content of the General Purpose Registers (GPRs) as
source operands, and place results into GPRs, into the
fixed-point Exception Register (XER), and into
Condition Register fields. In addition, the Trap instructions
compare the contents of one GPR with a second GPR
or immediate data and, if the conditions are met,
invoke the system trap handler.

These instructions treat the source operands as
signed integers unless the instruction is explicitly
identified as performing an unsigned operation.

The X-form and XO-form instructions with Rc = 1, and
the D-form instructions addic., andi., and andis., set
the first three bits of CR Field 0 to characterize the
result placed into the target register. In 64-bit mode,
these bits are set as if the 64-bit result were
compared algebraically to zero. In 32-bit mode, these bits
are set as if the low-order 32 bits of the result were
compared algebraically to zero.
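
The mode dependence of the first three CR Field 0 bits can be made concrete with a short Python sketch. The helper name cr0_bits and the tuple it returns are assumptions made for illustration; the function models only the three bits discussed here.

```python
def cr0_bits(result, mode64):
    """Return (LT, GT, EQ) as set by a record-form instruction (sketch)."""
    width = 64 if mode64 else 32
    value = result & ((1 << width) - 1)    # 32-bit mode looks at the low-order word only
    if value >> (width - 1):               # reinterpret as a signed integer
        value -= 1 << width
    return (int(value < 0), int(value > 0), int(value == 0))

same_value = 0x0000_0001_0000_0000
in_64 = cr0_bits(same_value, mode64=True)    # positive as a 64-bit number
in_32 = cr0_bits(same_value, mode64=False)   # low-order 32 bits are zero
```

The same register contents therefore compare as greater than zero in 64-bit mode but as equal to zero in 32-bit mode.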

Unless otherwise noted and when appropriate, when 
CR Field 0 and the XER are set they reflect the value 
placed in the target register. 

- Programming Note - 

Instructions with the OE bit set or which set CA 
may execute slowly or may prevent the execution 
of subsequent instructions until the operation is 
completed. 
3.3.9 Fixed-Point Arithmetic Instructions


The XO-form Arithmetic instructions with Rc = 1, and
the D-form Arithmetic instruction addic., set the first
three bits of CR Field 0 as described in Section 3.3.8,
“Other Fixed-Point Instructions” on page 49.

addic, addic., subfic, addc, subfc, adde, subfe, addme,
subfme, addze, and subfze always set CA, to reflect
the carry out of bit 0 in 64-bit mode and out of bit 32
in 32-bit mode. With the exception of mulld and
mullw, the XO-form Arithmetic instructions set SO and
OV when OE = 1, to reflect overflow of the 64-bit result
in 64-bit mode and overflow of the low-order 32-bit
result in 32-bit mode. mulld and mullw set SO and OV
when OE = 1, to reflect overflow of the 64-bit result for
mulld and overflow of the low-order 32-bit result for
mullw.

- Programming Note -

Notice that CR Field 0 may not reflect the “true”
(infinitely precise) result if overflow occurs.


Extended mnemonics for addition and
subtraction

Several extended mnemonics are provided that use
the Add Immediate and Add Immediate Shifted
instructions to load an immediate value or an address
into a target register. Some of these are shown as
examples with the two instructions.

The PowerPC Architecture supplies Subtract From
instructions, which subtract the second operand from
the third. A set of extended mnemonics is provided
that use the more “normal” order, in which the third
operand is subtracted from the second, with the third
operand being either an immediate field or a register.
Some of these are shown as examples with the
appropriate Add and Subtract From instructions.

See Appendix C, “Assembler Extended Mnemonics”
on page 223 for additional extended mnemonics.


Add Immediate D-form


addi RT,RA,SI

[Power mnemonic: cal]


14 RT RA SI
0 6 11 16 31


if RA = 0 then RT <- EXTS(SI)
else RT <- (RA) + EXTS(SI)

The sum (RA|0) + SI is placed into register RT.

Special Registers Altered:

None

Extended Mnemonics:

Examples of extended mnemonics for Add Immediate:

Extended:            Equivalent to:

li Rx,value          addi Rx,0,value
la Rx,disp(Ry)       addi Rx,Ry,disp
subi Rx,Ry,value     addi Rx,Ry,-value

- Programming Note -

addi, addis, add, and subf are the preferred
instructions for addition and subtraction, because
they set few status bits.

Notice that addi and addis use the value 0, not the
contents of GPR 0, if RA = 0.


Add Immediate Shifted D-form


addis RT,RA,SI
[Power mnemonic: cau]


15 RT RA SI
0 6 11 16 31


if RA = 0 then RT <- EXTS(SI || ^{16}0)
else RT <- (RA) + EXTS(SI || ^{16}0)

The sum (RA|0) + (SI || 0x0000) is placed into register
RT.

Special Registers Altered:

None

Extended Mnemonics:

Examples of extended mnemonics for Add Immediate
Shifted:

Extended:            Equivalent to:

lis Rx,value         addis Rx,0,value
subis Rx,Ry,value    addis Rx,Ry,-value
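
The (RA|0) rule and the sign extension of SI are easy to overlook, so a small Python sketch of both instructions may help. It assumes a list of 32 integers for the GPRs and takes SI as the raw 16-bit field; addi, addis, and exts are hypothetical helper names.

```python
MASK64 = (1 << 64) - 1

def exts(value, bits):
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def addi(gpr, rt, ra, si):
    base = 0 if ra == 0 else gpr[ra]          # the value 0, not the contents of GPR 0
    gpr[rt] = (base + exts(si, 16)) & MASK64

def addis(gpr, rt, ra, si):
    base = 0 if ra == 0 else gpr[ra]
    gpr[rt] = (base + exts((si & 0xFFFF) << 16, 32)) & MASK64   # SI || 16 zero bits

gpr = [0] * 32
gpr[0] = 123
addi(gpr, 3, 0, 5)            # li   R3,5       : GPR 0 is not read
addis(gpr, 4, 0, 0x1234)      # lis  R4,0x1234
addi(gpr, 5, 4, 0x8000)       # SI = 0x8000 is sign-extended to -32768
```

The last call shows why a low halfword with its high bit set cannot simply be added after lis: the sign extension subtracts 0x8000 rather than adding it.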
Add XO-form


add    RT,RA,RB  (OE = 0 Rc = 0)
add.   RT,RA,RB  (OE = 0 Rc = 1)
addo   RT,RA,RB  (OE = 1 Rc = 0)
addo.  RT,RA,RB  (OE = 1 Rc = 1)

[Power mnemonics: cax, cax., caxo, caxo.]


31 RT RA RB OE 266 Rc
0 6 11 16 21 22 31


RT <- (RA) + (RB)

The sum (RA) + (RB) is placed into register RT.

Special Registers Altered:

CR0 (if Rc = 1)
SO OV (if OE = 1)


Subtract From XO-form

subf    RT,RA,RB  (OE = 0 Rc = 0)
subf.   RT,RA,RB  (OE = 0 Rc = 1)
subfo   RT,RA,RB  (OE = 1 Rc = 0)
subfo.  RT,RA,RB  (OE = 1 Rc = 1)


31 RT RA RB OE 40 Rc
0 6 11 16 21 22 31


RT <- ¬(RA) + (RB) + 1

The sum ¬(RA) + (RB) + 1 is placed into register
RT.

Special Registers Altered:

CR0 (if Rc = 1)
SO OV (if OE = 1)

Extended Mnemonics:

Example of extended mnemonics for Subtract From:

Extended:            Equivalent to:

sub Rx,Ry,Rz         subf Rx,Rz,Ry
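
That the complement-and-add form really computes (RB) minus (RA) can be checked with a few lines of Python. This is a sketch on 64-bit values held as Python integers; subf and sub are hypothetical helpers that take register contents rather than register numbers.

```python
MASK64 = (1 << 64) - 1

def subf(ra_value, rb_value):
    """One's complement of (RA), plus (RB), plus 1, truncated to 64 bits."""
    return ((~ra_value & MASK64) + rb_value + 1) & MASK64

def sub(ry_value, rz_value):
    """Extended mnemonic sub Rx,Ry,Rz is subf Rx,Rz,Ry: computes Ry - Rz."""
    return subf(rz_value, ry_value)

difference = subf(3, 10)      # (RB) - (RA) = 10 - 3
wrapped = subf(10, 3)         # 3 - 10, as a 64-bit two's complement value
```

Adding the complement plus one is the usual two's complement identity, so no separate subtraction operation is needed.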


Add Immediate Carrying D-form

addic RT,RA,SI
[Power mnemonic: ai]


12 RT RA SI
0 6 11 16 31


RT <- (RA) + EXTS(SI)

The sum (RA) + SI is placed into register RT.

Special Registers Altered:

CA

Extended Mnemonics:

Example of extended mnemonics for Add Immediate
Carrying:

Extended:            Equivalent to:

subic Rx,Ry,value    addic Rx,Ry,-value


Add Immediate Carrying and Record
D-form

addic. RT,RA,SI
[Power mnemonic: ai.]


13 RT RA SI
0 6 11 16 31


RT <- (RA) + EXTS(SI)

The sum (RA) + SI is placed into register RT.

Special Registers Altered:

CR0 CA

Extended Mnemonics:

Example of extended mnemonics for Add Immediate
Carrying and Record:

Extended:            Equivalent to:

subic. Rx,Ry,value   addic. Rx,Ry,-value
Subtract From Immediate Carrying
D-form

subfic RT,RA,SI
[Power mnemonic: sfi]


08 RT RA SI
0 6 11 16 31


RT <- ¬(RA) + EXTS(SI) + 1

The sum ¬(RA) + SI + 1 is placed into register RT.

Special Registers Altered:

CA


Add Carrying XO-form


addc    RT,RA,RB  (OE = 0 Rc = 0)
addc.   RT,RA,RB  (OE = 0 Rc = 1)
addco   RT,RA,RB  (OE = 1 Rc = 0)
addco.  RT,RA,RB  (OE = 1 Rc = 1)

[Power mnemonics: a, a., ao, ao.]

RT <- (RA) + (RB)

The sum (RA) + (RB) is placed into register RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)


Subtract From Carrying XO-form


subfc   RT,RA,RB  (OE = 0 Rc = 0)
subfc.  RT,RA,RB  (OE = 0 Rc = 1)
subfco  RT,RA,RB  (OE = 1 Rc = 0)
subfco. RT,RA,RB  (OE = 1 Rc = 1)

[Power mnemonics: sf, sf., sfo, sfo.]


31 RT RA RB OE 8 Rc
0 6 11 16 21 22 31


RT <- ¬(RA) + (RB) + 1

The sum ¬(RA) + (RB) + 1 is placed into register
RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)

Extended Mnemonics:

Example of extended mnemonics for Subtract From
Carrying:

Extended:            Equivalent to:

subc Rx,Ry,Rz        subfc Rx,Rz,Ry
Add Extended XO-form


adde    RT,RA,RB  (OE = 0 Rc = 0)
adde.   RT,RA,RB  (OE = 0 Rc = 1)
addeo   RT,RA,RB  (OE = 1 Rc = 0)
addeo.  RT,RA,RB  (OE = 1 Rc = 1)

[Power mnemonics: ae, ae., aeo, aeo.]

RT <- (RA) + (RB) + CA

The sum (RA) + (RB) + CA is placed into register
RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)


Subtract From Extended XO-form


subfe   RT,RA,RB  (OE = 0 Rc = 0)
subfe.  RT,RA,RB  (OE = 0 Rc = 1)
subfeo  RT,RA,RB  (OE = 1 Rc = 0)
subfeo. RT,RA,RB  (OE = 1 Rc = 1)

[Power mnemonics: sfe, sfe., sfeo, sfeo.]

RT <- ¬(RA) + (RB) + CA

The sum ¬(RA) + (RB) + CA is placed into register
RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)


Add To Minus One Extended XO-form

addme   RT,RA  (OE = 0 Rc = 0)
addme.  RT,RA  (OE = 0 Rc = 1)
addmeo  RT,RA  (OE = 1 Rc = 0)
addmeo. RT,RA  (OE = 1 Rc = 1)

[Power mnemonics: ame, ame., ameo, ameo.]

31 RT RA /// OE 234 Rc

RT <- (RA) + CA - 1

The sum (RA) + CA + ^{64}1 is placed into register RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)


Subtract From Minus One Extended
XO-form

subfme   RT,RA  (OE = 0 Rc = 0)
subfme.  RT,RA  (OE = 0 Rc = 1)
subfmeo  RT,RA  (OE = 1 Rc = 0)
subfmeo. RT,RA  (OE = 1 Rc = 1)

[Power mnemonics: sfme, sfme., sfmeo, sfmeo.]

RT <- ¬(RA) + CA - 1

The sum ¬(RA) + CA + ^{64}1 is placed into register
RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)

Add To Zero Extended XO-form


addze   RT,RA  (OE = 0 Rc = 0)
addze.  RT,RA  (OE = 0 Rc = 1)
addzeo  RT,RA  (OE = 1 Rc = 0)
addzeo. RT,RA  (OE = 1 Rc = 1)

[Power mnemonics: aze, aze., azeo, azeo.]

RT <- (RA) + CA

The sum (RA) + CA is placed into register RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)


Subtract From Zero Extended XO-form


subfze   RT,RA  (OE = 0 Rc = 0)
subfze.  RT,RA  (OE = 0 Rc = 1)
subfzeo  RT,RA  (OE = 1 Rc = 0)
subfzeo. RT,RA  (OE = 1 Rc = 1)

[Power mnemonics: sfze, sfze., sfzeo, sfzeo.]


31 RT RA /// OE 200 Rc
0 6 11 16 21 22 31


RT <- ¬(RA) + CA

The sum ¬(RA) + CA is placed into register RT.

Special Registers Altered:

CA
CR0 (if Rc = 1)
SO OV (if OE = 1)


- Programming Note -

The setting of CA by the Add and Subtract
instructions, including the Extended versions
thereof, is mode-dependent. If a sequence of
these instructions is used to perform
extended-precision addition or subtraction, the same mode
should be used throughout the sequence.
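
The reason for this advice can be seen in a short Python sketch of the carry rule. The helper names add_with_carry and add128 are hypothetical; the first models the 64-bit sum together with CA taken from bit 0 (64-bit mode) or bit 32 (32-bit mode), and the second chains an addc-style and an adde-style step in 64-bit mode.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def add_with_carry(a, b, carry_in, mode64):
    """Return (64-bit result, CA) for (RA) + (RB) + carry_in."""
    result = (a + b + carry_in) & MASK64
    if mode64:
        ca = (a + b + carry_in) >> 64                        # carry out of bit 0
    else:
        ca = ((a & MASK32) + (b & MASK32) + carry_in) >> 32  # carry out of bit 32
    return result, ca

def add128(a_hi, a_lo, b_hi, b_lo):
    """Two-doubleword addition in 64-bit mode: addc on the low halves, adde on the high."""
    lo, ca = add_with_carry(a_lo, b_lo, 0, mode64=True)
    hi, _ = add_with_carry(a_hi, b_hi, ca, mode64=True)
    return hi, lo

r64 = add_with_carry(0xFFFF_FFFF, 1, 0, mode64=True)
r32 = add_with_carry(0xFFFF_FFFF, 1, 0, mode64=False)
total = add128(0, MASK64, 0, 1)
```

The same operands produce CA = 0 in 64-bit mode and CA = 1 in 32-bit mode, so switching modes in the middle of a multi-step addition would feed the wrong carry into the next step.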


Negate XO-form


neg    RT,RA  (OE = 0 Rc = 0)
neg.   RT,RA  (OE = 0 Rc = 1)
nego   RT,RA  (OE = 1 Rc = 0)
nego.  RT,RA  (OE = 1 Rc = 1)

31 

RT 

RA 
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OE 
RT <- ¬(RA) + 1

The sum ¬(RA) + 1 is placed into register RT.

If executing in 64-bit mode and register RA contains
the most negative 64-bit number
(0x8000_0000_0000_0000), the result is the most negative
number and, if OE = 1, OV is set to 1. Similarly, if
executing in 32-bit mode and (RA)_{32:63} contains the
most negative 32-bit number (0x8000_0000), the
low-order 32 bits of the result contain the most negative
32-bit number and, if OE = 1, OV is set to 1.

Special Registers Altered: 

CR0 (if Rc = 1)
SO OV (if OE = 1)
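
The overflow case for Negate can be sketched in Python as follows. The helper neg takes the contents of RA and a mode flag and returns the 64-bit result together with the value OV would take when OE = 1; the name and return shape are assumptions made for this illustration.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def neg(ra_value, mode64):
    """Return (result, ov) for the one's complement of (RA) plus 1."""
    result = ((~ra_value & MASK64) + 1) & MASK64
    if mode64:
        ov = ra_value == 1 << 63                 # most negative 64-bit number
    else:
        ov = (ra_value & MASK32) == 1 << 31      # most negative 32-bit number
    return result, ov

plain = neg(5, mode64=True)
edge64 = neg(1 << 63, mode64=True)
edge32 = neg(0x8000_0000, mode64=False)
```

In both edge cases the operand has no positive counterpart in its width, so the result (or its low-order word) is the operand itself.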
Multiply Low Immediate D-form


mulli RT,RA,SI

[Power mnemonic: muli]


07 RT RA SI
0 6 11 16 31


prod_{0:127} <- (RA) × EXTS(SI)

RT <- prod_{64:127}

The 64-bit first operand is (RA). The 64-bit second 
operand is the sign-extended value of the SI field. 
The low-order 64 bits of the 128-bit product of the 
operands are placed into register RT. 

Both the operands and the product are interpreted as 
signed integers. 

Special Registers Altered: 

None 


Multiply Low Doubleword XO-form 


mulld   RT,RA,RB  (OE = 0 Rc = 0)

mulld.  RT,RA,RB  (OE = 0 Rc = 1)

mulldo  RT,RA,RB  (OE = 1 Rc = 0)

mulldo. RT,RA,RB  (OE = 1 Rc = 1)


31 

RT 
RB

OE

Rc

prod 0:127 <- (RA) × (RB)

RT <- prod 64:127

The 64-bit operands are (RA) and (RB). The low-order 
64 bits of the 128-bit product of the operands are 
placed into register RT. 

If OE = 1 then OV is set to 1 if the product cannot be 
represented in 64 bits. 

Both the operands and the product are interpreted as 
signed integers. 

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

SO OV (if OE = 1)

- Programming Note - 

The XO-form Multiply instructions may execute 
faster on some implementations if RB contains 
the operand having the smaller absolute value. 


Multiply Low Word XO-form

mullw   RT,RA,RB  (OE = 0 Rc = 0)

mullw.  RT,RA,RB  (OE = 0 Rc = 1)

mullwo  RT,RA,RB  (OE = 1 Rc = 0)

mullwo. RT,RA,RB  (OE = 1 Rc = 1)

[Power mnemonics: muls, muls., mulso, mulso.]


RT <- (RA) 32:63 × (RB) 32:63

The 32-bit operands are the low-order 32 bits of RA 
and of RB. The 64-bit product of the operands is 
placed into register RT. 

If OE = 1 then OV is set to 1 if the product cannot be 
represented in 32 bits. 

Both the operands and the product are interpreted as 
signed integers. 

Special Registers Altered: 

CR0 (if Rc = 1)

SO OV (if OE = 1)

- Programming Note - 

For mulli and mullw, the low-order 32 bits of the 
product are the correct 32-bit product for 32-bit 
mode. 

For mulli and mulld, the low-order 64 bits of the
product are independent of whether the operands
are regarded as signed or unsigned 64-bit integers.
For mulli and mullw, the low-order 32 bits
of the product are independent of whether the
operands are regarded as signed or unsigned
32-bit integers.
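
The claim about the low-order product bits can be checked with a short Python sketch. The helper names `to_signed64` and `mul_low64` are hypothetical; the model simply forms the full product under each interpretation and keeps the low-order 64 bits.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret the low 64 bits of x as a two's-complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def mul_low64(ra, rb, signed):
    """Low-order 64 bits of the 128-bit product of two 64-bit operands."""
    if signed:
        a, b = to_signed64(ra), to_signed64(rb)
    else:
        a, b = ra & MASK64, rb & MASK64
    return (a * b) & MASK64
```

The two interpretations differ only in the high-order half of the product, which is why separate signed and unsigned forms exist for the Multiply High instructions but not for the Multiply Low instructions.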
Multiply High Doubleword XO-form


mulhd  RT,RA,RB  (Rc = 0)

mulhd. RT,RA,RB  (Rc = 1)

31 RT RA RB / 73 Rc

0 6 11 16 21 22 31


prod 0:127 <- (RA) × (RB)

RT <- prod 0:63

The 64-bit operands are (RA) and (RB). The high-order
64 bits of the 128-bit product of the operands
are placed into register RT.

Both the operands and the product are interpreted as
signed integers.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)


Multiply High Doubleword Unsigned XO-form


mulhdu  RT,RA,RB  (Rc = 0)

mulhdu. RT,RA,RB  (Rc = 1)

31 RT RA RB / 9 Rc


prod 0:127 <- (RA) × (RB)

RT <- prod 0:63

The 64-bit operands are (RA) and (RB). The high-order
64 bits of the 128-bit product of the operands
are placed into register RT.

Both the operands and the product are interpreted as
unsigned integers.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)


Multiply High Word XO-form


mulhw  RT,RA,RB  (Rc = 0)

mulhw. RT,RA,RB  (Rc = 1)

31 RT RA RB / 75 Rc

0 6 11 16 21 22 31


prod 0:63 <- (RA) 32:63 × (RB) 32:63
RT 32:63 <- prod 0:31
RT 0:31 <- undefined

The 32-bit operands are the low-order 32 bits of RA
and of RB. The high-order 32 bits of the 64-bit
product of the operands are placed into RT 32:63.
(RT) 0:31 are undefined.

Both the operands and the product are interpreted as
signed integers.

Special Registers Altered:

CR0 (if Rc = 1)


Multiply High Word Unsigned XO-form 


mulhwu  RT,RA,RB  (Rc = 0)

mulhwu. RT,RA,RB  (Rc = 1)



0 6 11 16 21 22 31


prod 0:63 <- (RA) 32:63 × (RB) 32:63
RT 32:63 <- prod 0:31
RT 0:31 <- undefined

The 32-bit operands are the low-order 32 bits of RA
and of RB. The high-order 32 bits of the 64-bit
product of the operands are placed into RT 32:63.
(RT) 0:31 are undefined.

Both the operands and the product are interpreted as
unsigned integers.

Special Registers Altered:

CR0 (if Rc = 1)
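
Unlike the low-order half, the high-order half of a product depends on how the operands are interpreted. The following Python sketch models the doubleword forms; the function names mirror the mnemonics, but the code is an illustrative assumption rather than a definition from this book.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret the low 64 bits of x as a two's-complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def mulhd(ra, rb):
    """High-order 64 bits of the signed 128-bit product."""
    return ((to_signed64(ra) * to_signed64(rb)) >> 64) & MASK64

def mulhdu(ra, rb):
    """High-order 64 bits of the unsigned 128-bit product."""
    return ((ra & MASK64) * (rb & MASK64)) >> 64
```

For example, with both registers holding all ones, the signed product is (-1) × (-1) = 1, so the high-order half is zero, while the unsigned high-order half is 2^64 - 2.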
Divide Doubleword XO-form


divd   RT,RA,RB  (OE = 0 Rc = 0)

divd.  RT,RA,RB  (OE = 0 Rc = 1)

divdo  RT,RA,RB  (OE = 1 Rc = 0)

divdo. RT,RA,RB  (OE = 1 Rc = 1)

31 RT RA RB OE 489 Rc

0 6 11 16 21 22 31


dividend 0:63 <- (RA)
divisor 0:63 <- (RB)
RT <- dividend ÷ divisor

The 64-bit dividend is (RA). The 64-bit divisor is (RB).
The 64-bit quotient of the dividend and divisor is
placed into RT. The remainder is not supplied as a
result.

Both the dividend and the divisor are interpreted as
signed integers. The quotient is the unique signed
integer that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < |divisor| if the dividend is nonnegative,
and -|divisor| < r ≤ 0 if the dividend is negative.

If an attempt is made to perform any of the divisions

0x8000_0000_0000_0000 ÷ -1
<anything> ÷ 0

then the contents of register RT are undefined as are
(if Rc = 1) the contents of the LT, GT, and EQ bits of
CR Field 0. In these cases, if OE = 1 then OV is set to
1.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

SO OV (if OE = 1)

- Programming Note -

The 64-bit signed remainder of dividing (RA) by
(RB) can be computed as follows, except in the
case that (RA) = -2^63 and (RB) = -1.

divd RT,RA,RB # RT = quotient

mulld RT,RT,RB # RT = quotient*divisor

subf RT,RT,RA # RT = remainder


Divide Word XO-form


divw   RT,RA,RB  (OE = 0 Rc = 0)

divw.  RT,RA,RB  (OE = 0 Rc = 1)

divwo  RT,RA,RB  (OE = 1 Rc = 0)

divwo. RT,RA,RB  (OE = 1 Rc = 1)


31

RT

RB

OE


dividend 0:63 <- EXTS((RA) 32:63)
divisor 0:63 <- EXTS((RB) 32:63)
RT 32:63 <- dividend ÷ divisor
RT 0:31 <- undefined

The 64-bit dividend is the sign-extended value of
(RA) 32:63. The 64-bit divisor is the sign-extended
value of (RB) 32:63. The 64-bit quotient is formed. The
low-order 32 bits of the 64-bit quotient are placed into
RT 32:63. (RT) 0:31 are undefined. The remainder is not
supplied as a result.

Both the dividend and the divisor are interpreted as
signed integers. The quotient is the unique signed
integer that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < |divisor| if the dividend is nonnegative,
and -|divisor| < r ≤ 0 if the dividend is negative.

If an attempt is made to perform any of the divisions

0x8000_0000 ÷ -1
<anything> ÷ 0

then the contents of register RT are undefined as are
(if Rc = 1) the contents of the LT, GT, and EQ bits of
CR Field 0. In these cases, if OE = 1 then OV is set to
1.

Special Registers Altered:

CR0 (if Rc = 1)

SO OV (if OE = 1)

- Programming Note -

The 32-bit signed remainder of dividing (RA) 32:63
by (RB) 32:63 can be computed as follows, except in
the case that (RA) = -2^31 and (RB) = -1.

divw RT,RA,RB # RT = quotient

mullw RT,RT,RB # RT = quotient*divisor

subf RT,RT,RA # RT = remainder
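
The remainder sequences in the Programming Notes can be modeled in Python for the doubleword case. This is a sketch under stated assumptions: the names `divd` and `remainder64` are used here as plain functions, `None` stands in for an undefined result, and the quotient is truncated toward zero, which is what the constraint on r implies.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret the low 64 bits of x as a two's-complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def divd(ra, rb):
    """Signed quotient truncated toward zero; None models an undefined result."""
    a, b = to_signed64(ra), to_signed64(rb)
    if b == 0 or (a == -(1 << 63) and b == -1):
        return None
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK64

def remainder64(ra, rb):
    """Mirror the divd / mulld / subf sequence."""
    q = divd(ra, rb)
    if q is None:
        return None
    prod = (to_signed64(q) * to_signed64(rb)) & MASK64  # mulld RT,RT,RB
    return ((ra & MASK64) - prod) & MASK64              # subf RT,RT,RA gives RA - RT
```

The remainder takes the sign of the dividend: dividing -7 by 2 yields quotient -3 and remainder -1.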
Divide Doubleword Unsigned XO-form


divdu   RT,RA,RB  (OE = 0 Rc = 0)

divdu.  RT,RA,RB  (OE = 0 Rc = 1)

divduo  RT,RA,RB  (OE = 1 Rc = 0)

divduo. RT,RA,RB  (OE = 1 Rc = 1)

31 RT RA RB OE 457 Rc

0 6 11 16 21 22 31


dividend 0:63 <- (RA)
divisor 0:63 <- (RB)
RT <- dividend ÷ divisor

The 64-bit dividend is (RA). The 64-bit divisor is (RB).
The 64-bit quotient of the dividend and divisor is
placed into RT. The remainder is not supplied as a
result.

Both the dividend and the divisor are interpreted as
unsigned integers. The quotient is the unique
unsigned integer that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < divisor.

If an attempt is made to perform the division

<anything> ÷ 0

then the contents of register RT are undefined as are
(if Rc = 1) the contents of the LT, GT, and EQ bits of
CR Field 0. In this case, if OE = 1 then OV is set to 1.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

SO OV (if OE = 1)

- Programming Note -

The 64-bit unsigned remainder of dividing (RA) by
(RB) can be computed as follows.

divdu RT,RA,RB # RT = quotient

mulld RT,RT,RB # RT = quotient*divisor

subf RT,RT,RA # RT = remainder


Divide Word Unsigned XO-form


divwu   RT,RA,RB  (OE = 0 Rc = 0)

divwu.  RT,RA,RB  (OE = 0 Rc = 1)

divwuo  RT,RA,RB  (OE = 1 Rc = 0)

divwuo. RT,RA,RB  (OE = 1 Rc = 1)

31 RT RA RB OE 459 Rc

0 6 11 16 21 22 31


dividend 0:63 <- 32 0 || (RA) 32:63
divisor 0:63 <- 32 0 || (RB) 32:63
RT 32:63 <- dividend ÷ divisor
RT 0:31 <- undefined

The 64-bit dividend is the zero-extended value of
(RA) 32:63. The 64-bit divisor is the zero-extended
value of (RB) 32:63. The 64-bit quotient is formed. The
low-order 32 bits of the 64-bit quotient are placed into
RT 32:63. (RT) 0:31 are undefined. The remainder is not
supplied as a result.

Both the dividend and the divisor are interpreted as
unsigned integers. The quotient is the unique
unsigned integer that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < divisor.

If an attempt is made to perform the division

<anything> ÷ 0

then the contents of register RT are undefined as are
(if Rc = 1) the contents of the LT, GT, and EQ bits of
CR Field 0. In this case, if OE = 1 then OV is set to 1.

Special Registers Altered:

CR0 (if Rc = 1)

SO OV (if OE = 1)

- Programming Note -

The 32-bit unsigned remainder of dividing
(RA) 32:63 by (RB) 32:63 can be computed as follows.

divwu RT,RA,RB # RT = quotient

mullw RT,RT,RB # RT = quotient*divisor

subf RT,RT,RA # RT = remainder
3.3.10 Fixed-Point Compare Instructions


The Fixed-Point Compare instructions algebraically or
logically compare the contents of register RA with (1)
the sign-extended value of the SI field, (2) the UI field,
or (3) the contents of register RB. Algebraic comparison
compares two signed integers. Logical comparison
compares two unsigned integers.

For 64-bit implementations, the L field controls
whether the operands are treated as 64- or 32-bit
quantities, as follows:

L Operand length

0 32-bit operands

1 64-bit operands

When the operands are treated as 32-bit signed quantities,
bit 32 of the register (RA or RB) is the sign bit.

For 32-bit implementations, the L field must be zero.

The Compare instructions set one bit in the leftmost
three bits of the designated CR field to one, and the
other two to zero. XER SO is copied into bit 3 of the
designated CR field.

The CR field is set as follows.

Bit Name Description

0 LT (RA) < SI, UI, or (RB)

1 GT (RA) > SI, UI, or (RB)

2 EQ (RA) = SI, UI, or (RB)

3 SO Summary Overflow from the XER
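
The way a compare fills a 4-bit CR field can be sketched in Python. The function `compare` and its parameters are hypothetical names for illustration; the second operand is assumed to be already extended by the caller (EXTS(SI), the zero-extended UI, or the contents of RB), and bit 0 of the CR field is returned as the most significant bit of the result.

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def compare(ra, operand, l=1, logical=False, xer_so=0):
    """Return the 4-bit CR field value LT GT EQ SO for one comparison."""
    bits = 64 if l == 1 else 32
    mask = (1 << bits) - 1
    a, b = ra & mask, operand & mask
    if not logical:
        # Algebraic comparison treats both operands as signed integers.
        a, b = to_signed(a, bits), to_signed(b, bits)
    if a < b:
        c = 0b100
    elif a > b:
        c = 0b010
    else:
        c = 0b001
    return (c << 1) | (xer_so & 1)
```

A register holding all ones compares less than 1 algebraically but greater than 1 logically, so the same operands set LT in one case and GT in the other.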

Extended mnemonics for compares 

A set of extended mnemonics is provided so that
compares can be coded with the operand length as
part of the instruction mnemonic rather than as a
numeric operand. Some of these are shown as examples
with the Compare instructions. The extended
mnemonics for doubleword comparisons are available
only in 64-bit implementations. See Appendix C,
“Assembler Extended Mnemonics” on page 223 for
additional extended mnemonics.


Compare Immediate D-form


cmpi BF,L,RA,SI

11 BF / L RA SI

0 6 9 10 11 16 31


if L = 0 then a <- EXTS((RA) 32:63)
         else a <- (RA)
if      a < EXTS(SI) then c <- 0b100
else if a > EXTS(SI) then c <- 0b010
else                      c <- 0b001
CR 4×BF:4×BF+3 <- c || XER SO

The contents of register RA ((RA) 32:63 sign-extended
to 64 bits if L = 0) is compared with the sign-extended
value of the SI field, treating the operands as signed
integers. The result of the comparison is placed into
CR field BF.

In 32-bit implementations, if L = 1 the instruction form
is invalid.

Special Registers Altered:

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare Immediate:

Extended:             Equivalent to:

cmpdi Rx,value        cmpi 0,1,Rx,value

cmpwi cr3,Rx,value    cmpi 3,0,Rx,value


Compare X-form


cmp BF,L,RA,RB

31 BF / L RA RB 0 /

0 6 9 10 11 16 21 31


if L = 0 then a <- EXTS((RA) 32:63)
              b <- EXTS((RB) 32:63)
         else a <- (RA)
              b <- (RB)
if      a < b then c <- 0b100
else if a > b then c <- 0b010
else               c <- 0b001
CR 4×BF:4×BF+3 <- c || XER SO

The contents of register RA ((RA) 32:63 if L = 0) is compared
with the contents of register RB ((RB) 32:63 if
L = 0), treating the operands as signed integers. The
result of the comparison is placed into CR field BF.

In 32-bit implementations, if L = 1 the instruction form
is invalid.

Special Registers Altered:

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare:

Extended:          Equivalent to:

cmpd Rx,Ry         cmp 0,1,Rx,Ry

cmpw cr3,Rx,Ry     cmp 3,0,Rx,Ry
Compare Logical Immediate D-form


cmpli BF,L,RA,UI

10 BF / L RA UI

0 6 9 10 11 16 31


if L = 0 then a <- 32 0 || (RA) 32:63
         else a <- (RA)
if      a <u (48 0 || UI) then c <- 0b100
else if a >u (48 0 || UI) then c <- 0b010
else                           c <- 0b001
CR 4×BF:4×BF+3 <- c || XER SO

The contents of register RA ((RA) 32:63 zero-extended
to 64 bits if L = 0) is compared with 48 0 || UI, treating
the operands as unsigned integers. The result of the
comparison is placed into CR field BF.

In 32-bit implementations, if L = 1 the instruction form
is invalid.

Special Registers Altered:

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare
Logical Immediate:

Extended:              Equivalent to:

cmpldi Rx,value        cmpli 0,1,Rx,value

cmplwi cr3,Rx,value    cmpli 3,0,Rx,value


Compare Logical X-form


cmpl BF,L,RA,RB

31 BF / L RA RB 32 /

0 6 9 10 11 16 21 31


if L = 0 then a <- 32 0 || (RA) 32:63
              b <- 32 0 || (RB) 32:63
         else a <- (RA)
              b <- (RB)
if      a <u b then c <- 0b100
else if a >u b then c <- 0b010
else                c <- 0b001
CR 4×BF:4×BF+3 <- c || XER SO

The contents of register RA ((RA) 32:63 if L = 0) is compared
with the contents of register RB ((RB) 32:63 if
L = 0), treating the operands as unsigned integers.
The result of the comparison is placed into CR field
BF.

In 32-bit implementations, if L = 1 the instruction form
is invalid.

Special Registers Altered:

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare
Logical:

Extended:           Equivalent to:

cmpld Rx,Ry         cmpl 0,1,Rx,Ry

cmplw cr3,Rx,Ry     cmpl 3,0,Rx,Ry
3.3.11 Fixed-Point Trap Instructions


The Trap instructions are provided to test for a specified
set of conditions. If any of the conditions tested
by a Trap instruction are met, the system trap handler
is invoked. If the tested conditions are not met,
instruction execution continues normally.

The contents of register RA is compared with either
the sign-extended value of the SI field or the contents
of register RB, depending on the Trap instruction. For
tdi and td, the entire contents of RA (and RB) participate
in the comparison; for twi and tw, only the contents
of the low-order 32 bits of RA (and RB)
participate in the comparison.

This comparison results in five conditions which are
ANDed with TO. If the result is not 0 the system trap
handler is invoked. These conditions are:


TO bit ANDed with Condition

0 Less Than

1 Greater Than

2 Equal

3 Logically Less Than

4 Logically Greater Than
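
The trap test can be sketched in Python as a 5-bit condition vector ANDed with TO. The function `trap_taken` and its `bits` parameter are hypothetical; TO bit 0 is treated as the most significant bit of the 5-bit field, so it carries the value 16 when TO is written as a number.

```python
def trap_taken(to, a, b, bits=64):
    """Return True if any condition selected by the 5-bit TO field holds.

    bits=64 models tdi/td; bits=32 models twi/tw, which use only the
    low-order 32 bits of the operands.
    """
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask              # logical (unsigned) view
    sign = 1 << (bits - 1)
    sa = ua - (1 << bits) if ua & sign else ua   # algebraic (signed) view
    sb = ub - (1 << bits) if ub & sign else ub
    conditions = (
        ((sa < sb) << 4)      # TO bit 0: Less Than
        | ((sa > sb) << 3)    # TO bit 1: Greater Than
        | ((sa == sb) << 2)   # TO bit 2: Equal
        | ((ua < ub) << 1)    # TO bit 3: Logically Less Than
        | (ua > ub)           # TO bit 4: Logically Greater Than
    )
    return (to & conditions) != 0
```

Under this numbering, TO = 31 selects every condition, so one of them always holds and the trap is unconditional.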

Extended mnemonics for traps 

A set of extended mnemonics is provided so that 
traps can be coded with the condition as part of the 
instruction mnemonic rather than as a numeric 
operand. Some of these are shown as examples with 
the Trap instructions. See Appendix C, “Assembler 
Extended Mnemonics” on page 223 for additional 
extended mnemonics. 


Trap Doubleword Immediate D-form


tdi TO,RA,SI

02 TO RA SI

0 6 11 16 31


a <- (RA)
if (a < EXTS(SI)) & TO 0 then TRAP
if (a > EXTS(SI)) & TO 1 then TRAP
if (a = EXTS(SI)) & TO 2 then TRAP
if (a <u EXTS(SI)) & TO 3 then TRAP
if (a >u EXTS(SI)) & TO 4 then TRAP

The contents of register RA is compared with the
sign-extended value of the SI field. If any bit in the
TO field is set to 1 and its corresponding condition is
met by the result of the comparison, then the system
trap handler is invoked.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

None

Extended Mnemonics:

Examples of extended mnemonics for Trap
Doubleword Immediate:

Extended:          Equivalent to:

tdlti Rx,value     tdi 16,Rx,value

tdnei Rx,value     tdi 24,Rx,value


Trap Word Immediate D-form


twi TO,RA,SI

[Power mnemonic: ti]

03 TO RA SI

0 6 11 16 31


a <- EXTS((RA) 32:63)
if (a < EXTS(SI)) & TO 0 then TRAP
if (a > EXTS(SI)) & TO 1 then TRAP
if (a = EXTS(SI)) & TO 2 then TRAP
if (a <u EXTS(SI)) & TO 3 then TRAP
if (a >u EXTS(SI)) & TO 4 then TRAP

The contents of RA 32:63 is compared with the sign-extended
value of the SI field. If any bit in the TO
field is set to 1 and its corresponding condition is met
by the result of the comparison, then the system trap
handler is invoked.

Special Registers Altered:

None

Extended Mnemonics:

Examples of extended mnemonics for Trap Word
Immediate:

Extended:           Equivalent to:

twgti Rx,value      twi 8,Rx,value

twllei Rx,value     twi 6,Rx,value
Trap Doubleword X-form


td TO,RA,RB

0 6 11 16 21 31


a <- (RA)
b <- (RB)
if (a < b) & TO 0 then TRAP
if (a > b) & TO 1 then TRAP
if (a = b) & TO 2 then TRAP
if (a <u b) & TO 3 then TRAP
if (a >u b) & TO 4 then TRAP

The contents of register RA is compared with the contents
of register RB. If any bit in the TO field is set to
1 and its corresponding condition is met by the result
of the comparison, then the system trap handler is
invoked.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

None

Extended Mnemonics:

Examples of extended mnemonics for Trap
Doubleword:

Extended:       Equivalent to:

tdge Rx,Ry      td 12,Rx,Ry

tdlnl Rx,Ry     td 5,Rx,Ry


Trap Word X-form


tw TO,RA,RB

[Power mnemonic: t]

31 TO RA RB 4 /


a <- EXTS((RA) 32:63)
b <- EXTS((RB) 32:63)
if (a < b) & TO 0 then TRAP
if (a > b) & TO 1 then TRAP
if (a = b) & TO 2 then TRAP
if (a <u b) & TO 3 then TRAP
if (a >u b) & TO 4 then TRAP

The contents of RA 32:63 is compared with the contents
of RB 32:63. If any bit in the TO field is set to 1 and its
corresponding condition is met by the result of the
comparison, then the system trap handler is invoked.

Special Registers Altered:

None

Extended Mnemonics:

Examples of extended mnemonics for Trap Word:

Extended:       Equivalent to:

tweq Rx,Ry      tw 4,Rx,Ry

twlge Rx,Ry     tw 5,Rx,Ry

trap            tw 31,0,0
3.3.12 Fixed-Point Logical Instructions


The Logical instructions perform bit-parallel operations
on 64-bit operands.

The X-form Logical instructions with Rc = 1, and the
D-form Logical instructions andi. and andis., set the
first three bits of CR Field 0 as described in Section
3.3.8, “Other Fixed-Point Instructions” on page 49.
The Logical instructions do not change the SO, OV,
and CA bits in the XER.


Extended mnemonics for logical operations

An extended mnemonic is provided that generates the
preferred form of “no-op” (an instruction that does
nothing). This is shown as an example with the OR
Immediate instruction.

Extended mnemonics are provided that use the OR
and NOR instructions to copy the contents of one register
to another, with and without complementing.
These are shown as examples with the two
instructions.

See Appendix C, “Assembler Extended Mnemonics”
on page 223 for additional extended mnemonics.


AND Immediate D-form 

andi. RA,RS,UI 

[Power mnemonic: andil.]


28 RS RA UI

0 6 11 16 31


RA <- (RS) & (48 0 || UI)

The contents of register RS is ANDed with 48 0 || UI and
the result is placed into register RA.

Special Registers Altered:

CR0


AND Immediate Shifted D-form 

andis. RA,RS,UI 
[Power mnemonic: andiu.] 


29 RS RA UI

0 6 11 16 31


RA <- (RS) & (32 0 || UI || 16 0)

The contents of register RS is ANDed with 32 0 || UI ||
16 0 and the result is placed into register RA.

Special Registers Altered:

CR0
OR Immediate D-form

ori RA,RS,UI

[Power mnemonic: oril]


24 RS RA UI

0 6 11 16 31


RA <- (RS) | (48 0 || UI)

The contents of register RS is ORed with 48 0 || UI and 
the result is placed into register RA. 

The preferred “no-op” (an instruction that does 
nothing) is: 

ori 0,0,0 

Special Registers Altered: 

None 

Extended Mnemonics: 

Example of extended mnemonics for OR Immediate: 

Extended: Equivalent to: 

nop ori 0,0,0 


OR Immediate Shifted D-form 

oris RA,RS,UI

[Power mnemonic: oriu]


25 RS RA UI

0 6 11 16 31


RA <- (RS) | (32 0 || UI || 16 0)

The contents of register RS is ORed with 32 0 || UI || 16 0 
and the result is placed into register RA. 

Special Registers Altered: 

None 


XOR Immediate D-form


xori RA,RS,UI

[Power mnemonic: xoril]


26 RS RA UI

0 6 11 16 31


RA <- (RS) ⊕ (48 0 || UI)

The contents of register RS is XORed with 48 0 || UI
and the result is placed into register RA.

Special Registers Altered:

None


XOR Immediate Shifted D-form


xoris RA,RS,UI

[Power mnemonic: xoriu]


27 RS RA UI

0 6 11 16 31


RA <- (RS) ⊕ (32 0 || UI || 16 0)

The contents of register RS is XORed with 32 0 || UI ||
16 0 and the result is placed into register RA.

Special Registers Altered:

None
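
The two ways the D-form logical instructions position the 16-bit UI field can be sketched in Python. The function names mirror the mnemonics but are plain illustrative functions; setting of CR Field 0 by andi. and andis. is not modeled.

```python
MASK64 = (1 << 64) - 1

def imm(ui):
    """48 zero bits followed by UI: the immediate occupies bits 48:63."""
    return ui & 0xFFFF

def imm_shifted(ui):
    """32 zero bits, UI, then 16 zero bits: the immediate occupies bits 32:47."""
    return (ui & 0xFFFF) << 16

def andi(rs, ui):  return rs & imm(ui)
def andis(rs, ui): return rs & imm_shifted(ui)
def ori(rs, ui):   return (rs | imm(ui)) & MASK64
def oris(rs, ui):  return (rs | imm_shifted(ui)) & MASK64
def xori(rs, ui):  return (rs ^ imm(ui)) & MASK64
def xoris(rs, ui): return (rs ^ imm_shifted(ui)) & MASK64
```

Because one form covers bits 48:63 and the other bits 32:47, a pair such as ori followed by oris can set any 32-bit pattern in the low-order word of a register.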
AND X-form


and  RA,RS,RB  (Rc = 0)

and. RA,RS,RB  (Rc = 1)


31 RS RA RB 28 Rc

0 6 11 16 21 31


RA <- (RS) & (RB)

The contents of register RS is ANDed with the contents
of register RB and the result is placed into register
RA.

Special Registers Altered:

CR0 (if Rc = 1)


XOR X-form 


xor  RA,RS,RB  (Rc = 0)

xor. RA,RS,RB  (Rc = 1)


31 RS RA RB 316 Rc

0 6 11 16 21 31


RA <- (RS) ⊕ (RB)

The contents of register RS is XORed with the contents
of register RB and the result is placed into register
RA.

Special Registers Altered:

CR0 (if Rc = 1)


OR X-form 

or RA,RS,RB (Rc = 0) 

or. RA,RS,RB (Rc = 1) 


31 RS RA RB 444 Rc

0 6 11 16 21 31


RA <- (RS) | (RB)

The contents of register RS is ORed with the contents
of register RB and the result is placed into register
RA.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics: 

Example of extended mnemonics for OR: 

Extended: Equivalent to: 

mr Rx,Ry or Rx,Ry,Ry 


NAND X-form 

nand RA,RS,RB (Rc = 0) 

nand. RA,RS,RB (Rc=1) 


31 RS RA RB 476 Rc

0 6 11 16 21 31


RA <- ¬((RS) & (RB))

The contents of register RS is ANDed with the contents
of register RB and the complemented result is
placed into register RA.

Special Registers Altered:

CR0 (if Rc = 1)

- Programming Note -

nand or nor with RS=RB can be used to obtain
the one's complement.
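
The one's complement idiom can be checked with a short Python sketch, in which the same value is supplied as both source operands. The function names mirror the mnemonics but are illustrative only.

```python
MASK64 = (1 << 64) - 1

def nand(rs, rb):
    """Complement of the AND of two 64-bit values."""
    return ~(rs & rb) & MASK64

def nor(rs, rb):
    """Complement of the OR of two 64-bit values."""
    return ~(rs | rb) & MASK64
```

Since x AND x and x OR x both equal x, either instruction reduces to a plain complement when its two source operands are the same register.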
NOR X-form


nor  RA,RS,RB  (Rc = 0)

nor. RA,RS,RB  (Rc = 1)


31 RS RA RB 124 Rc

0 6 11 16 21 31


RA <- ¬((RS) | (RB))

The contents of register RS is ORed with the contents
of register RB and the complemented result is placed
into register RA.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Example of extended mnemonics for NOR:

Extended:      Equivalent to:

not Rx,Ry      nor Rx,Ry,Ry


Equivalent X-form


eqv  RA,RS,RB  (Rc = 0)

eqv. RA,RS,RB  (Rc = 1)


RA <- (RS) ≡ (RB)

The contents of register RS is XORed with the contents
of register RB and the complemented result is
placed into register RA.

Special Registers Altered:

CR0 (if Rc = 1)


AND with Complement X-form


andc  RA,RS,RB  (Rc = 0)

andc. RA,RS,RB  (Rc = 1)


RA <- (RS) & ¬(RB)

The contents of register RS is ANDed with the complement
of the contents of register RB and the result
is placed into register RA.

Special Registers Altered:

CR0 (if Rc = 1)


OR with Complement X-form


orc  RA,RS,RB  (Rc = 0)

orc. RA,RS,RB  (Rc = 1)


RA <- (RS) | ¬(RB)

The contents of register RS is ORed with the complement
of the contents of register RB and the result is
placed into register RA.

Special Registers Altered:

CR0 (if Rc = 1)


66 PowerPC Architecture First Edition


Extend Sign Byte X-form 


extsb RA,RS (Rc = 0) 

extsb. RA,RS (Rc = 1) 


31 RS RA /// 954 Rc

0 6 11 16 21 31


s <- (RS) 56
RA 56:63 <- (RS) 56:63
RA 0:55 <- 56 s

(RS) 56:63 are placed into RA 56:63. Bit 56 of register RS
is placed into RA 0:55.

Special Registers Altered:

CR0 (if Rc = 1)


Extend Sign Halfword X-form 


extsh RA,RS (Rc = 0) 

extsh. RA,RS (Rc=1) 

[Power mnemonics: exts, exts.] 


31 RS RA /// 922 Rc

0 6 11 16 21 31


s <- (RS) 48
RA 48:63 <- (RS) 48:63
RA 0:47 <- 48 s

(RS) 48:63 are placed into RA 48:63. Bit 48 of register RS
is placed into RA 0:47.

Special Registers Altered:

CR0 (if Rc = 1)


Extend Sign Word X-form 

extsw RA,RS (Rc = 0) 

extsw. RA,RS (Rc = 1)


31 RS RA /// 986 Rc

0 6 11 16 21 31


s <- (RS) 32
RA 32:63 <- (RS) 32:63
RA 0:31 <- 32 s

(RS) 32:63 are placed into RA 32:63. Bit 32 of register RS
is placed into RA 0:31.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)
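
The three Extend Sign instructions differ only in the width of the field being extended, so one Python sketch covers them all. The function `extend_sign` and its `width` parameter are hypothetical: width 8 models extsb, 16 models extsh, and 32 models extsw.

```python
MASK64 = (1 << 64) - 1

def extend_sign(rs, width):
    """Copy the low `width` bits of RS and replicate their sign bit upward."""
    field = (1 << width) - 1
    low = rs & field
    s = low >> (width - 1)                # the sign bit of the field
    high = (MASK64 ^ field) if s else 0   # 64 - width copies of s
    return high | low
```

Bits of RS above the extended field do not affect the result, as the last test case below shows for a 32-bit field.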
Count Leading Zeros Doubleword X-form


cntlzd  RA,RS  (Rc = 0)

cntlzd. RA,RS  (Rc = 1)


31 RS RA /// 58 Rc

0 6 11 16 21 31


n <- 0
do while n < 64
   if (RS) n = 1 then leave
   n <- n + 1
RA <- n

A count of the number of consecutive zero bits 
starting at bit 0 of register RS is placed into RA. This 
number ranges from 0 to 64, inclusive. 

If Rc = 1, CR Field 0 is set to reflect the result. 

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)


Count Leading Zeros Word X-form 


cntlzw  RA,RS  (Rc = 0)

cntlzw. RA,RS  (Rc = 1)

[Power mnemonics: cntlz, cntlz.]


31 RS RA /// 26 Rc

0 6 11 16 21 31


n <- 32
do while n < 64
   if (RS) n = 1 then leave
   n <- n + 1
RA <- n - 32

A count of the number of consecutive zero bits
starting at bit 32 of register RS is placed into RA.
This number ranges from 0 to 32, inclusive.

If Rc = 1, CR Field 0 is set to reflect the result.

Special Registers Altered:

CR0 (if Rc = 1)

- Programming Note -

For both Count Leading Zeros instructions, if
Rc = 1 then LT is set to zero in CR Field 0.
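
Both counting loops can be expressed as one Python function that follows the pseudocode directly. The function name and the `start` parameter are hypothetical: start = 0 models cntlzd and start = 32 models cntlzw. Bit n of the register, with bit 0 as the most significant bit, is obtained by shifting right by 63 - n.

```python
def count_leading_zeros(rs, start=0):
    """Count consecutive zero bits of a 64-bit value, beginning at bit `start`."""
    n = start
    while n < 64:
        if (rs >> (63 - n)) & 1:
            break
        n += 1
    return n - start
```

The result is never negative, which is consistent with the note that LT is set to zero when Rc = 1.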
3.3.13 Fixed-Point Rotate and Shift Instructions


The Fixed-Point Processor performs rotation operations
on data from a GPR and returns the result, or a
portion of the result, to a GPR.

The rotation operations rotate a 64-bit quantity left by
a specified number of bit positions. Bits that exit from
position 0 enter at position 63.

Two types of rotation operation are supported.

For the first type, denoted rotate 64 or ROTL 64, the
value rotated is the given 64-bit value. The rotate 64
operation is used to rotate a given 64-bit quantity.

For the second type, denoted rotate 32 or ROTL 32, the
value rotated consists of two copies of bits 32:63 of
the given 64-bit value, one copy in bits 0:31 and the
other in bits 32:63. The rotate 32 operation is used to
rotate a given 32-bit quantity.
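
The two rotation types can be sketched in Python as follows. The names `rotl64` and `rotl32` are illustrative stand-ins for the two operations described above.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    """Rotate a 64-bit value left by n positions (bits leaving bit 0 re-enter at bit 63)."""
    x &= MASK64
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotl32(x, n):
    """Rotate two copies of the low-order 32 bits of x as one 64-bit value."""
    low = x & MASK32
    return rotl64((low << 32) | low, n)
```

Because the rotated value is periodic with period 32, each half of the result of `rotl32` holds the low-order word rotated left within 32 bits.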

The Rotate and Shift instructions employ a mask generator.
The mask is 64 bits long, and consists of
1-bits from a start bit, mstart, through and including a
stop bit, mstop, and 0-bits elsewhere. The values of
mstart and mstop range from zero to 63. If mstart >
mstop, the 1-bits wrap around from position 63 to
position 0. Thus the mask is formed as follows:

if mstart ≤ mstop then
    mask mstart:mstop = ones
    mask all other bits = zeros
else
    mask mstart:63 = ones
    mask 0:mstop = ones
    mask all other bits = zeros

There is no way to specify an all-zero mask.
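
The mask generator can be sketched in Python by following the two cases literally. The function name `mask` is illustrative; bit positions use the numbering of this book, in which bit 0 is the most significant bit of the 64-bit value.

```python
MASK64 = (1 << 64) - 1

def mask(mstart, mstop):
    """Build a 64-bit mask with 1-bits from mstart through mstop, wrapping if needed."""
    if mstart <= mstop:
        positions = list(range(mstart, mstop + 1))
    else:
        # The 1-bits wrap around from position 63 to position 0.
        positions = list(range(mstart, 64)) + list(range(0, mstop + 1))
    m = 0
    for i in positions:
        m |= 1 << (63 - i)   # bit i counted from the most significant end
    return m
```

Every (mstart, mstop) pair sets at least one bit, which is why an all-zero mask cannot be specified.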

For instructions that use the rotate 32 operation, the 
mask start and stop positions are always in the low- 
order 32 bits of the register. 

The use of the mask is described in following 
sections. 


The Rotate and Shift instructions with Rc = 1 set the 
first three bits of CR field 0 as described in Section 
3.3.8, “Other Fixed-Point Instructions” on page 49. 
Rotate and Shift instructions do not change the OV 
and SO bits. Rotate and Shift instructions, except 
algebraic right shifts, do not change the CA bit. 

Extended mnemonics for rotates and 
shifts 

The Rotate and Shift instructions, while powerful, can
be complicated to code (they have up to five operands).
A set of extended mnemonics is provided that
allows simpler coding of often-used functions such as
clearing the leftmost or rightmost bits of a register,
left justifying or right justifying an arbitrary field, and
simple rotates and shifts. Some of these are shown
as examples with the Rotate instructions. See
Appendix C, “Assembler Extended Mnemonics” on
page 223 for additional extended mnemonics.

3.3.13.1 Fixed-Point Rotate Instructions 

These instructions rotate the contents of a register. 
The result of the rotation is 

■ Inserted into the target register under control of a 
mask (if a mask bit is 1 the associated bit of the 
rotated data is placed into the target register, 
and if the mask bit is 0 the associated bit in the 
target register remains unchanged); or 

■ ANDed with a mask before being placed into the 
target register. 

The Rotate Left instructions allow right-rotation of the
contents of a register to be performed (in concept) by
a left-rotation of 64-N, where N is the number of bits
by which to rotate right. They allow right-rotation of
the contents of the low-order 32 bits of a register to
be performed (in concept) by a left-rotation of 32-N,
where N is the number of bits by which to rotate right.
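
As a minimal sketch of this identity (Python is used here only for illustration; the helper names rotl64, rotr64, rotl32, and rotr32 are hypothetical and are not part of the architecture), a right rotation can be modelled as a left rotation by the complementary amount:

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def rotl64(x, n):
    """Rotate the 64-bit value x left by n bits (0 <= n <= 63)."""
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotr64(x, n):
    """Rotate right by n bits, expressed as a left rotation of 64-n."""
    return rotl64(x, (64 - n) % 64)

def rotl32(x, n):
    """Rotate the low-order 32 bits of x left by n bits (0 <= n <= 31)."""
    n &= 31
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotr32(x, n):
    """Rotate right by n bits, expressed as a left rotation of 32-n."""
    return rotl32(x, (32 - n) % 32)
```

For example, rotating 0x0123456789ABCDEF right by 8 bits is the same as rotating it left by 56 bits.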
Rotate Left Doubleword Immediate then
Clear Left MD-form 


rldicl RA,RS,SH,MB (Rc = 0) 

rldicl. RA,RS,SH,MB (Rc = 1) 


Field:  30 | RS | RA | sh | mb | 0  | sh | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 27 | 30 | 31


Rotate Left Doubleword Immediate then 
Clear Right MD-form 


rldicr RA,RS,SH,ME (Rc = 0) 

rldicr. RA,RS,SH,ME (Rc=1) 


Field:  30 | RS | RA | sh | me | 1  | sh | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 27 | 30 | 31


n ← sh_5 || sh_{0:4}
r ← ROTL_{64}((RS), n)
b ← mb_5 || mb_{0:4}
m ← MASK(b, 63)
RA ← r & m

The contents of register RS are rotated_{64} left SH bits.
A mask is generated having 1-bits from bit MB
through bit 63 and 0-bits elsewhere. The rotated data
is ANDed with the generated mask and the result is
placed into register RA.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.


n ← sh_5 || sh_{0:4}
r ← ROTL_{64}((RS), n)
e ← me_5 || me_{0:4}
m ← MASK(0, e)
RA ← r & m

The contents of register RS are rotated_{64} left SH bits.
A mask is generated having 1-bits from bit 0 through
bit ME and 0-bits elsewhere. The rotated data is
ANDed with the generated mask and the result is
placed into register RA.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.


Special Registers Altered (rldicl):

CR0 (if Rc = 1)

Extended Mnemonics:

Examples of extended mnemonics for Rotate Left
Doubleword Immediate then Clear Left:

Extended:              Equivalent to:
extrdi  Rx,Ry,n,b      rldicl  Rx,Ry,b+n,64-n
srdi    Rx,Ry,n        rldicl  Rx,Ry,64-n,n
clrldi  Rx,Ry,n        rldicl  Rx,Ry,0,n


Special Registers Altered (rldicr):

CR0 (if Rc = 1)

Extended Mnemonics:

Examples of extended mnemonics for Rotate Left
Doubleword Immediate then Clear Right:

Extended:              Equivalent to:
extldi  Rx,Ry,n,b      rldicr  Rx,Ry,b,n-1
sldi    Rx,Ry,n        rldicr  Rx,Ry,n,63-n
clrrdi  Rx,Ry,n        rldicr  Rx,Ry,0,63-n


- Programming Note -

rldicl can be used to extract an n-bit field, that
starts at bit position b in register RS, right-justified
into register RA (clearing the remaining
64-n bits of RA), by setting SH = b+n and
MB = 64-n. It can be used to rotate the contents
of a register left (right) by n bits, by setting SH = n
(64-n) and MB = 0. It can be used to shift the
contents of a register right by n bits, by setting
SH = 64-n and MB = n. It can be used to clear
the high-order n bits of a register, by setting
SH = 0 and MB = n.

Extended mnemonics are provided for all of these
uses: see Appendix C, “Assembler Extended
Mnemonics” on page 223.


- Programming Note -

rldicr can be used to extract an n-bit field, that
starts at bit position b in register RS, left-justified
into register RA (clearing the remaining 64-n bits
of RA), by setting SH = b and ME = n-1. It can be
used to rotate the contents of a register left
(right) by n bits, by setting SH = n (64-n) and
ME = 63. It can be used to shift the contents of a
register left by n bits, by setting SH = n and
ME = 63-n. It can be used to clear the low-order
n bits of a register, by setting SH = 0 and
ME = 63-n.

Extended mnemonics are provided for all of these
uses (some devolve to rldicl): see Appendix C,
“Assembler Extended Mnemonics” on page 223.
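
The uses listed in the two notes above can be checked with a small software model. This is only a sketch, assuming the MASK and ROTL definitions given earlier in this section; the Python function names are illustrative stand-ins for the instructions and extended mnemonics, and bit 0 is the most significant bit:

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    """ROTL_64: rotate the 64-bit value x left by n bits."""
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64

def mask(mstart, mstop):
    """MASK(mstart, mstop), with bit 0 as the most significant bit."""
    from_start = MASK64 >> mstart                  # bits mstart..63
    to_stop = MASK64 & ~(MASK64 >> (mstop + 1))    # bits 0..mstop
    if mstart <= mstop:
        return from_start & to_stop
    return from_start | to_stop                    # wrapped mask

def rldicl(rs, sh, mb):
    return rotl64(rs, sh) & mask(mb, 63)

def rldicr(rs, sh, me):
    return rotl64(rs, sh) & mask(0, me)

# Extended mnemonics, written out as in the tables (0 < n < 64).
def extrdi(rs, n, b): return rldicl(rs, b + n, 64 - n)
def srdi(rs, n):      return rldicl(rs, 64 - n, n)
def clrldi(rs, n):    return rldicl(rs, 0, n)
def extldi(rs, n, b): return rldicr(rs, b, n - 1)
def sldi(rs, n):      return rldicr(rs, n, 63 - n)
def clrrdi(rs, n):    return rldicr(rs, 0, 63 - n)
```

With RS = 0x0123456789ABCDEF, the 8-bit field starting at bit 16 is 0x45, so extrdi yields 0x45 right-justified and extldi yields 0x45 in the high-order byte.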
Rotate Left Doubleword Immediate then Clear MD-form

Rotate Left Word Immediate then AND with Mask M-form


rldic RA,RS,SH,MB (Rc = 0)

rldic. RA,RS,SH,MB (Rc = 1)


Field:  30 | RS | RA | sh | mb | 2  | sh | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 27 | 30 | 31



n ← sh_5 || sh_{0:4}
r ← ROTL_{64}((RS), n)
b ← mb_5 || mb_{0:4}
m ← MASK(b, ¬n)
RA ← r & m

The contents of register RS are rotated_{64} left SH bits.
A mask is generated having 1-bits from bit MB
through bit 63-SH, and 0-bits elsewhere. The rotated
data is ANDed with the generated mask and the result
is placed into register RA.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Example of extended mnemonics for Rotate Left
Doubleword Immediate then Clear:

Extended:              Equivalent to:
clrlsldi Rx,Ry,b,n     rldic Rx,Ry,n,b-n


rlwinm RA,RS,SH,MB,ME (Rc = 0)

rlwinm. RA,RS,SH,MB,ME (Rc = 1)

[Power mnemonics: rlinm, rlinm.]


Field:  21 | RS | RA | SH | MB | ME | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 26 | 31


n ← SH
r ← ROTL_{32}((RS)_{32:63}, n)
m ← MASK(MB+32, ME+32)
RA ← r & m

The contents of register RS are rotated_{32} left SH bits.
A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated
data is ANDed with the generated mask and the result
is placed into register RA.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Examples of extended mnemonics for Rotate Left
Word Immediate then AND with Mask:

Extended:              Equivalent to:
extlwi  Rx,Ry,n,b      rlwinm  Rx,Ry,b,0,n-1
srwi    Rx,Ry,n        rlwinm  Rx,Ry,32-n,n,31
clrrwi  Rx,Ry,n        rlwinm  Rx,Ry,0,0,31-n


- Programming Note -

rldic can be used to clear the high-order b bits of
the contents of a register and then shift the result
left by n bits by setting SH = n and MB = b-n. It
can be used to clear the high-order n bits of a
register, by setting SH = 0 and MB = n.

Extended mnemonics are provided for both of
these uses (the second devolves to rldicl): see
Appendix C, “Assembler Extended Mnemonics”
on page 223.
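
The first use in this note (clear, then shift left) can be sketched as follows. The model assumes the MASK and ROTL definitions of this section and uses the fact that the complement of the 6-bit value n equals 63-n; the Python names are illustrative only:

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64

def mask(mstart, mstop):
    """MASK(mstart, mstop), with bit 0 as the most significant bit."""
    from_start = MASK64 >> mstart
    to_stop = MASK64 & ~(MASK64 >> (mstop + 1))
    if mstart <= mstop:
        return from_start & to_stop
    return from_start | to_stop

def rldic(rs, sh, mb):
    # The mask stops at bit 63-SH, the complement of the 6-bit shift count.
    return rotl64(rs, sh) & mask(mb, 63 - sh)

def clrlsldi(rs, b, n):
    """Clear the high-order b bits, then shift left by n bits (n <= b)."""
    return rldic(rs, n, b - n)
```

The result matches the two-step computation "AND with a mask that clears b high-order bits, then shift left n" for any n not greater than b.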
- Programming Note -

Let RSL represent the low-order 32 bits of register RS,
with the bits numbered from 0 through 31.

rlwinm can be used to extract an n-bit field, that
starts at bit position b in RSL, right-justified into
the low-order 32 bits of register RA (clearing the
remaining 32-n bits of the low-order 32 bits of
RA), by setting SH = b+n, MB = 32-n, and
ME = 31. It can be used to extract an n-bit field,
that starts at bit position b in RSL, left-justified
into the low-order 32 bits of register RA (clearing
the remaining 32-n bits of the low-order 32 bits
of RA), by setting SH = b, MB = 0, and ME = n-1.
It can be used to rotate the contents of the low-order
32 bits of a register left (right) by n bits, by
setting SH = n (32-n), MB = 0, and ME = 31. It can
be used to shift the contents of the low-order 32
bits of a register right by n bits, by setting
SH = 32-n, MB = n, and ME = 31. It can be used to
clear the high-order b bits of the low-order 32 bits
of the contents of a register and then shift the
result left by n bits by setting SH = n, MB = b-n
and ME = 31-n. It can be used to clear the low-order
n bits of the low-order 32 bits of a register,
by setting SH = 0, MB = 0, and ME = 31-n.

For all the uses given above, the high-order 32
bits of register RA are cleared.

Extended mnemonics are provided for all of these
uses: see Appendix C, “Assembler Extended
Mnemonics” on page 223.
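
A few of these settings can be tried in a small model of rlwinm. The sketch below assumes, as the RTL for rlwinm indicates, that the rotated low-order word is ANDed with MASK(MB+32, ME+32), and it models ROTL_32 as replicating the rotated word in both halves of a doubleword. All Python names are illustrative:

```python
MASK64 = (1 << 64) - 1

def mask(mstart, mstop):
    """MASK(mstart, mstop), with bit 0 as the most significant bit."""
    from_start = MASK64 >> mstart
    to_stop = MASK64 & ~(MASK64 >> (mstop + 1))
    if mstart <= mstop:
        return from_start & to_stop
    return from_start | to_stop

def rotl32(x, n):
    """ROTL_32: rotate the low-order word; replicate it in both halves."""
    w = x & 0xFFFFFFFF
    n &= 31
    r = ((w << n) | (w >> (32 - n))) & 0xFFFFFFFF
    return (r << 32) | r

def rlwinm(rs, sh, mb, me):
    return rotl32(rs, sh) & mask(mb + 32, me + 32)

# Settings taken from the note above (0 < n < 32).
def extract_right(rs, n, b): return rlwinm(rs, b + n, 32 - n, 31)
def extract_left(rs, n, b):  return rlwinm(rs, b, 0, n - 1)
def shift_right(rs, n):      return rlwinm(rs, 32 - n, n, 31)
def shift_left(rs, n):       return rlwinm(rs, n, 0, 31 - n)
```

In each of these cases MB is not greater than ME, so the high-order 32 bits of the result are zero regardless of the high-order bits of RS.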


Rotate Left Doubleword then Clear Left
MDS-form


rldcl RA,RS,RB,MB (Rc = 0)

rldcl. RA,RS,RB,MB (Rc = 1)


Field:  30 | RS | RA | RB | mb | 8  | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 27 | 31


n ← (RB)_{58:63}
r ← ROTL_{64}((RS), n)
b ← mb_5 || mb_{0:4}
m ← MASK(b, 63)
RA ← r & m

The contents of register RS are rotated_{64} left the
number of bits specified by (RB)_{58:63}. A mask is
generated having 1-bits from bit MB through bit 63 and
0-bits elsewhere. The rotated data is ANDed with the
generated mask and the result is placed into register
RA.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Example of extended mnemonics for Rotate Left
Doubleword then Clear Left:

Extended:              Equivalent to:
rotld Rx,Ry,Rz         rldcl Rx,Ry,Rz,0


- Programming Note -

rldcl can be used to extract an n-bit field, that
starts at variable bit position b in register RS,
right-justified into register RA (clearing the
remaining 64-n bits of RA), by setting
RB_{58:63} = b+n and MB = 64-n. It can be used to
rotate the contents of a register left (right) by
variable n bits by setting RB_{58:63} = n (64-n) and
MB = 0.

Extended mnemonics are provided for some of
these uses: see Appendix C, “Assembler
Extended Mnemonics” on page 223.
Rotate Left Doubleword then Clear Right
MDS-form


rldcr RA,RS,RB,ME (Rc = 0)

rldcr. RA,RS,RB,ME (Rc = 1)


Field:  30 | RS | RA | RB | me | 9  | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 27 | 31


n ← (RB)_{58:63}
r ← ROTL_{64}((RS), n)
e ← me_5 || me_{0:4}
m ← MASK(0, e)
RA ← r & m

The contents of register RS are rotated_{64} left the
number of bits specified by (RB)_{58:63}. A mask is
generated having 1-bits from bit 0 through bit ME and
0-bits elsewhere. The rotated data is ANDed with the
generated mask and the result is placed into register
RA.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

- Programming Note -

rldcr can be used to extract an n-bit field, that
starts at variable bit position b in register RS,
left-justified into register RA (clearing the remaining
64-n bits of RA), by setting RB_{58:63} = b and
ME = n-1. It can be used to rotate the contents of
a register left (right) by variable n bits by setting
RB_{58:63} = n (64-n) and ME = 63.

Extended mnemonics are provided for some of
these uses (some devolve to rldcl): see
Appendix C, “Assembler Extended Mnemonics”
on page 223.


Rotate Left Word then AND with Mask 
M-form 


rlwnm RA,RS,RB,MB,ME (Rc = 0) 

rlwnm. RA,RS,RB,MB,ME (Rc=1) 

[Power mnemonics: rlnm, rlnm.] 
Field:  23 | RS | RA | RB | MB | ME | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 26 | 31


n ← (RB)_{59:63}
r ← ROTL_{32}((RS)_{32:63}, n)
m ← MASK(MB+32, ME+32)
RA ← r & m

The contents of register RS are rotated_{32} left the
number of bits specified by (RB)_{59:63}. A mask is
generated having 1-bits from bit MB through bit ME and
0-bits elsewhere. The rotated data is ANDed with the
generated mask and the result is placed into register
RA.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Example of extended mnemonics for Rotate Left Word
then AND with Mask:

Extended:              Equivalent to:
rotlw Rx,Ry,Rz         rlwnm Rx,Ry,Rz,0,31

- Programming Note -

Let RSL represent the low-order 32 bits of register RS,
with the bits numbered from 0 through 31.

rlwnm can be used to extract an n-bit field, that
starts at variable bit position b in RSL, right-justified
into the low-order 32 bits of register RA
(clearing the remaining 32-n bits of the low-order
32 bits of RA), by setting RB_{59:63} = b+n,
MB = 32-n, and ME = 31. It can be used to extract
an n-bit field, that starts at variable bit position b
in RSL, left-justified into the low-order 32 bits of
register RA (clearing the remaining 32-n bits of
the low-order 32 bits of RA), by setting RB_{59:63} = b,
MB = 0, and ME = n-1. It can be used to rotate
the contents of the low-order 32 bits of a register
left (right) by variable n bits, by setting RB_{59:63} = n
(32-n), MB = 0, and ME = 31.

For all the uses given above, the high-order 32
bits of register RA are cleared.

Extended mnemonics are provided for some of
these uses: see Appendix C, “Assembler
Extended Mnemonics” on page 223.
Rotate Left Doubleword Immediate then Mask Insert MD-form

Rotate Left Word Immediate then Mask Insert M-form


rldimi RA,RS,SH,MB (Rc = 0) 

rldimi. RA,RS,SH,MB (Rc = 1) 


Field:  30 | RS | RA | sh | mb | 3  | sh | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 27 | 30 | 31


n ← sh_5 || sh_{0:4}
r ← ROTL_{64}((RS), n)
b ← mb_5 || mb_{0:4}
m ← MASK(b, ¬n)
RA ← r&m | (RA)&¬m

The contents of register RS are rotated_{64} left SH bits.
A mask is generated having 1-bits from bit MB
through bit 63-SH, and 0-bits elsewhere. The rotated
data is inserted into register RA under control of the
generated mask.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Example of extended mnemonics for Rotate Left
Doubleword Immediate then Mask Insert:

Extended:              Equivalent to:
insrdi Rx,Ry,n,b       rldimi Rx,Ry,64-(b+n),b

- Programming Note -

rldimi can be used to insert an n-bit field, that is
right-justified in register RS, into register RA
starting at bit position b, by setting
SH = 64-(b+n) and MB = b.

An extended mnemonic is provided for this use:
see Appendix C, “Assembler Extended
Mnemonics” on page 223.
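
The insertion described in this note can be sketched in software as below. The model follows the RTL for rldimi (mask from bit MB through bit 63-SH, rotated data merged into RA under that mask); the Python names are illustrative only:

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64

def mask(mstart, mstop):
    """MASK(mstart, mstop), with bit 0 as the most significant bit."""
    from_start = MASK64 >> mstart
    to_stop = MASK64 & ~(MASK64 >> (mstop + 1))
    if mstart <= mstop:
        return from_start & to_stop
    return from_start | to_stop

def rldimi(ra, rs, sh, mb):
    m = mask(mb, 63 - sh)
    # Bits under the mask come from the rotated RS; the rest keep RA's value.
    return (rotl64(rs, sh) & m) | (ra & ~m & MASK64)

def insrdi(ra, rs, n, b):
    """Insert the right-justified n-bit field of rs into ra at bit b."""
    return rldimi(ra, rs, 64 - (b + n), b)
```

Bits of RS outside the right-justified n-bit field do not affect the result, because they fall outside the mask after the rotation.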


rlwimi RA,RS,SH,MB,ME (Rc = 0)

rlwimi. RA,RS,SH,MB,ME (Rc = 1)

[Power mnemonics: rlimi, rlimi.]

Field:  20 | RS | RA | SH | MB | ME | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 26 | 31
n ← SH
r ← ROTL_{32}((RS)_{32:63}, n)
m ← MASK(MB+32, ME+32)
RA ← r&m | (RA)&¬m

The contents of register RS are rotated_{32} left SH bits.
A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated
data is inserted into register RA under control of the
generated mask.

Special Registers Altered:

CR0 (if Rc = 1)

Extended Mnemonics:

Example of extended mnemonics for Rotate Left Word
Immediate then Mask Insert:

Extended:              Equivalent to:
inslwi Rx,Ry,n,b       rlwimi Rx,Ry,32-b,b,b+n-1

- Programming Note -

Let RAL represent the low-order 32 bits of register RA,
with the bits numbered from 0 through 31.

rlwimi can be used to insert an n-bit field, that is
left-justified in the low-order 32 bits of register
RS, into RAL starting at bit position b, by setting
SH = 32-b, MB = b, and ME = (b+n)-1. It can be
used to insert an n-bit field, that is right-justified
in the low-order 32 bits of register RS, into RAL
starting at bit position b, by setting
SH = 32-(b+n), MB = b, and ME = (b+n)-1.

Extended mnemonics are provided for both of
these uses: see Appendix C, “Assembler
Extended Mnemonics” on page 223.
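
Both insertions in this note can be modelled as follows. This is a sketch under the same assumptions as the RTL for rlwimi (mask MASK(MB+32, ME+32), rotated word replicated in both halves); the function names insert_left and insert_right are hypothetical labels for the two settings:

```python
MASK64 = (1 << 64) - 1

def mask(mstart, mstop):
    """MASK(mstart, mstop), with bit 0 as the most significant bit."""
    from_start = MASK64 >> mstart
    to_stop = MASK64 & ~(MASK64 >> (mstop + 1))
    if mstart <= mstop:
        return from_start & to_stop
    return from_start | to_stop

def rotl32(x, n):
    """ROTL_32: rotate the low-order word; replicate it in both halves."""
    w = x & 0xFFFFFFFF
    n &= 31
    r = ((w << n) | (w >> (32 - n))) & 0xFFFFFFFF
    return (r << 32) | r

def rlwimi(ra, rs, sh, mb, me):
    m = mask(mb + 32, me + 32)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK64)

def insert_left(ra, rs, n, b):
    """Field is left-justified in the low-order word of rs."""
    return rlwimi(ra, rs, 32 - b, b, b + n - 1)

def insert_right(ra, rs, n, b):
    """Field is right-justified in the low-order word of rs."""
    return rlwimi(ra, rs, 32 - (b + n), b, b + n - 1)
```

Only the bits of RAL selected by the mask change; the remaining bits of RA, including its high-order word, keep their previous values.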






3.3.13.2 Fixed-Point Shift Instructions 


The instructions in this section perform left and right 
shifts. 

Extended mnemonics for shifts 

Immediate-form logical (unsigned) shift operations are
obtained by specifying appropriate masks and shift
values for certain Rotate instructions. A set of
extended mnemonics is provided to make coding of
such shifts simpler and easier to understand.
Some of these are shown
as examples with the Rotate instructions. See
Appendix C, “Assembler Extended Mnemonics” on
page 223 for additional extended mnemonics.


- Programming Note - 

Any Shift Right Algebraic instruction, followed by
addze, can be used to divide quickly by 2^N. The
setting of the CA bit by the Shift Right Algebraic
instructions is independent of mode.
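
The reason this sequence works is that an algebraic right shift rounds toward negative infinity, and CA records exactly the case in which one must be added back to round toward zero. A minimal sketch for the word form follows; the function names are illustrative, and the model returns a signed Python integer rather than a register image:

```python
def srawi(rs, sh):
    """Model of srawi on the low-order word: returns (signed result, CA)."""
    w = rs & 0xFFFFFFFF
    negative = w >> 31
    value = w - (1 << 32) if negative else w
    shifted_out = w & ((1 << sh) - 1)          # bits shifted out on the right
    ca = 1 if negative and shifted_out else 0
    return value >> sh, ca                     # >> is arithmetic for Python ints

def divide_by_pow2(x, n):
    """srawi followed by addze: quotient of x / 2**n, rounded toward zero."""
    q, ca = srawi(x, n)
    return q + ca                              # addze adds CA to the shifted value
```

For example, -7 shifted right algebraically by 1 gives -4 with CA = 1, and adding CA gives -3, which is -7/2 rounded toward zero.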


- Programming Note - 

Multiple-precision shifts can be programmed as 
shown in Appendix E.2, “Multiple-Precision Shifts” 
on page 247. 


Shift Left Doubleword X-form

Shift Left Word X-form


sld RA,RS,RB (Rc = 0) 

sld. RA,RS,RB (Rc = 1) 


Field:  31 | RS | RA | RB | 27 | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 31


n ← (RB)_{58:63}
r ← ROTL_{64}((RS), n)
if (RB)_{57} = 0 then
    m ← MASK(0, 63-n)
else m ← ^{64}0
RA ← r & m

The contents of register RS are shifted left the
number of bits specified by (RB)_{57:63}. Bits shifted out
of position 0 are lost. Zeros are supplied to the
vacated positions on the right. The result is placed
into register RA. Shift amounts from 64 to 127 give a
zero result.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)


slw RA,RS,RB (Rc = 0)

slw. RA,RS,RB (Rc = 1)

[Power mnemonics: sl, sl.]


Field:  31 | RS | RA | RB | 24 | Rc
Bit:    0  | 6  | 11 | 16 | 21 | 31
n ← (RB)_{59:63}
r ← ROTL_{32}((RS)_{32:63}, n)
if (RB)_{58} = 0 then
    m ← MASK(32, 63-n)
else m ← ^{64}0
RA ← r & m

The contents of the low-order 32 bits of register RS
are shifted left the number of bits specified by
(RB)_{58:63}. Bits shifted out of position 32 are lost.
Zeros are supplied to the vacated positions on the
right. The 32-bit result is placed into RA_{32:63}. RA_{0:31}
are set to zero. Shift amounts from 32 to 63 give a
zero result.

Special Registers Altered:

CR0 (if Rc = 1)
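
The treatment of large shift amounts in sld and slw can be illustrated with a direct model of the described results. This sketch computes the outcome with ordinary shifts rather than through the rotate-and-mask RTL, and its function names are illustrative:

```python
MASK64 = (1 << 64) - 1

def sld(rs, rb):
    """Shift Left Doubleword: the shift amount is the low 7 bits of RB."""
    n = rb & 0x7F
    if n >= 64:                    # amounts 64..127 give a zero result
        return 0
    return (rs << n) & MASK64

def slw(rs, rb):
    """Shift Left Word: the shift amount is the low 6 bits of RB."""
    n = rb & 0x3F
    if n >= 32:                    # amounts 32..63 give a zero result
        return 0
    return ((rs & 0xFFFFFFFF) << n) & 0xFFFFFFFF   # high-order word is zero
```

Note that only the low-order 7 (or 6) bits of RB take part, so a value of 128 in RB behaves as a shift amount of 0 for sld.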


Shift Right Doubleword X-form 


srd RA,RS,RB (Rc = 0)

srd. RA,RS,RB (Rc = 1)


Field:  31 | RS | RA | RB | 539 | Rc
Bit:    0  | 6  | 11 | 16 | 21  | 31


n ← (RB)_{58:63}
r ← ROTL_{64}((RS), 64-n)
if (RB)_{57} = 0 then
    m ← MASK(n, 63)
else m ← ^{64}0
RA ← r & m

The contents of register RS are shifted right the
number of bits specified by (RB)_{57:63}. Bits shifted out
of position 63 are lost. Zeros are supplied to the
vacated positions on the left. The result is placed into
register RA. Shift amounts from 64 to 127 give a zero
result.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CR0 (if Rc = 1)


Shift Right Word X-form 


srw RA,RS,RB (Rc = 0)

srw. RA,RS,RB (Rc = 1)

[Power mnemonics: sr, sr.]


Field:  31 | RS | RA | RB | 536 | Rc
Bit:    0  | 6  | 11 | 16 | 21  | 31
n ← (RB)_{59:63}
r ← ROTL_{32}((RS)_{32:63}, 64-n)
if (RB)_{58} = 0 then
    m ← MASK(n+32, 63)
else m ← ^{64}0
RA ← r & m

The contents of the low-order 32 bits of register RS
are shifted right the number of bits specified by
(RB)_{58:63}. Bits shifted out of position 63 are lost.
Zeros are supplied to the vacated positions on the
left. The 32-bit result is placed into RA_{32:63}. RA_{0:31}
are set to zero. Shift amounts from 32 to 63 give a
zero result.

Special Registers Altered:

CR0 (if Rc = 1)
Shift Right Algebraic Doubleword
Immediate XS-form


sradi RA,RS,SH (Rc = 0)

sradi. RA,RS,SH (Rc = 1)


Field:  31 | RS | RA | sh | 413 | sh | Rc
Bit:    0  | 6  | 11 | 16 | 21  | 30 | 31


n ← sh_5 || sh_{0:4}
r ← ROTL_{64}((RS), 64-n)
m ← MASK(n, 63)
s ← (RS)_0
RA ← r&m | (^{64}s)&¬m
CA ← s & ((r&¬m)≠0)

The contents of register RS are shifted right SH bits.
Bits shifted out of position 63 are lost. Bit 0 of RS is
replicated to fill the vacated positions on the left. The
result is placed into register RA. CA is set to 1 if (RS)
is negative and any 1-bits are shifted out of position
63; otherwise CA is set to 0. A shift amount of zero
causes RA to be set equal to (RS), and CA to be set
to 0.

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CA

CR0 (if Rc = 1)


Shift Right Algebraic Word Immediate 
X-form 


srawi RA,RS,SH (Rc = 0) 

srawi. RA,RS,SH (Rc = 1) 

[Power mnemonics: srai, srai.] 


Field:  31 | RS | RA | SH | 824 | Rc
Bit:    0  | 6  | 11 | 16 | 21  | 31
n ← SH
r ← ROTL_{32}((RS)_{32:63}, 64-n)
m ← MASK(n+32, 63)
s ← (RS)_{32}
RA ← r&m | (^{64}s)&¬m
CA ← s & ((r&¬m)_{32:63}≠0)

The contents of the low-order 32 bits of register RS
are shifted right SH bits. Bits shifted out of position
63 are lost. Bit 32 of RS is replicated to fill the
vacated positions on the left. The 32-bit result is
placed into RA_{32:63}. Bit 32 of RS is replicated to fill
RA_{0:31}. CA is set to 1 if the low-order 32 bits of (RS)
contain a negative number and any 1-bits are shifted
out of position 63; otherwise CA is set to 0. A shift
amount of zero causes RA to receive EXTS((RS)_{32:63}),
and CA to be set to 0.

Special Registers Altered:

CA

CR0 (if Rc = 1)
Shift Right Algebraic Doubleword
X-form 


srad RA,RS,RB (Rc = 0) 

srad. RA,RS,RB (Rc = 1) 
n ← (RB)_{58:63}
r ← ROTL_{64}((RS), 64-n)
if (RB)_{57} = 0 then
    m ← MASK(n, 63)
else m ← ^{64}0
s ← (RS)_0
RA ← r&m | (^{64}s)&¬m
CA ← s & ((r&¬m)≠0)

The contents of register RS are shifted right the
number of bits specified by (RB)_{57:63}. Bits shifted out
of position 63 are lost. Bit 0 of RS is replicated to fill
the vacated positions on the left. The result is placed
into register RA. CA is set to 1 if (RS) is negative and
any 1-bits are shifted out of position 63; otherwise CA
is set to 0. A shift amount of zero causes RA to be
set equal to (RS), and CA to be set to 0. Shift
amounts from 64 to 127 give a result of 64 sign bits in
RA, and cause CA to receive the sign bit of (RS).

This instruction is defined only for 64-bit implementations.
Using it on a 32-bit implementation will cause
the system illegal instruction error handler to be
invoked.

Special Registers Altered:

CA

CR0 (if Rc = 1)
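
The rules for RA and CA in srad, including the case of shift amounts from 64 to 127, can be sketched as below. This model computes the described results directly instead of following the rotate-and-mask RTL, and the function name is illustrative:

```python
MASK64 = (1 << 64) - 1

def srad(rs, rb):
    """Model of srad: returns (RA as an unsigned 64-bit value, CA)."""
    n = rb & 0x7F                              # shift amount, low 7 bits of RB
    s = (rs >> 63) & 1                         # sign bit, bit 0 of RS
    if n >= 64:
        # 64 sign bits in RA; CA receives the sign bit.
        return (MASK64 if s else 0), s
    signed = rs - (1 << 64) if s else rs
    ra = (signed >> n) & MASK64                # arithmetic shift, then wrap
    shifted_out = rs & ((1 << n) - 1)
    ca = 1 if s and shifted_out else 0
    return ra, ca
```

CA is set only when the operand is negative and at least one 1-bit is lost, which is why a negative value with zeros in the shifted-out positions leaves CA at 0.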


Shift Right Algebraic Word X-form 


sraw RA,RS,RB (Rc = 0) 

sraw. RA,RS,RB (Rc = 1)

[Power mnemonics: sra, sra.] 
n ← (RB)_{59:63}
r ← ROTL_{32}((RS)_{32:63}, 64-n)
if (RB)_{58} = 0 then
    m ← MASK(n+32, 63)
else m ← ^{64}0
s ← (RS)_{32}
RA ← r&m | (^{64}s)&¬m
CA ← s & ((r&¬m)_{32:63}≠0)

The contents of the low-order 32 bits of register RS
are shifted right the number of bits specified by
(RB)_{58:63}. Bits shifted out of position 63 are lost. Bit
32 of RS is replicated to fill the vacated positions on
the left. The 32-bit result is placed into RA_{32:63}. Bit
32 of RS is replicated to fill RA_{0:31}. CA is set to 1 if
the low-order 32 bits of (RS) contain a negative
number and any 1-bits are shifted out of position 63;
otherwise CA is set to 0. A shift amount of zero
causes RA to receive EXTS((RS)_{32:63}), and CA to be
set to 0. Shift amounts from 32 to 63 give a result of
64 sign bits, and cause CA to receive the sign bit of
(RS)_{32:63}.

Special Registers Altered:

CA

CR0 (if Rc = 1)
3.3.14 Move To/From System Register Instructions


Extended mnemonics 

A set of extended mnemonics is provided for the
mtspr and mfspr instructions so that they can be
coded with the SPR name as part of the mnemonic
rather than as a numeric operand. Some of these are
shown as examples with the two instructions. See
Appendix C, “Assembler Extended Mnemonics” on
page 221 for additional extended mnemonics.


Move To Special Purpose Register 
XFX-form 


mtspr SPR,RS


Field:  31 | RS | spr | 467 | /
Bit:    0  | 6  | 11  | 21  | 31


n ← spr_{5:9} || spr_{0:4}
if length(SPREG(n)) = 64 then
    SPREG(n) ← (RS)
else
    SPREG(n) ← (RS)_{32:63}{0:31}

The SPR field denotes a Special Purpose Register, 
encoded as shown in the table below. The contents of 
register RS are placed into the designated Special 
Purpose Register. For Special Purpose Registers that 
are 32 bits long, the low-order 32 bits of RS are 
placed into the SPR. 


decimal    SPR*                      Register
           spr_{5:9} spr_{0:4}       name

1          00000 00001               XER
8          00000 01000               LR
9          00000 01001               CTR

* Note that the order of the two 5-bit
  halves of the SPR number is reversed.


Additional values of the SPR field are defined in
Part 3, “PowerPC Operating Environment
Architecture” on page 141, and others may be
defined in Book IV, PowerPC Implementation Features
for the implementation. If the SPR field contains any
value other than one of these implementation-specific
values or one of the values shown above or in Book
III, the instruction form is invalid. However, the only
effect of executing an invalid instruction form in which
spr_0 = 1 is to invoke either the system privileged
instruction error handler or the system illegal instruction
error handler.

Special Registers Altered: 

See above 


Extended Mnemonics: 

Examples of extended mnemonics for Move To 
Special Purpose Register : 


Extended:              Equivalent to:
mtxer Rx               mtspr 1,Rx
mtlr  Rx               mtspr 8,Rx
mtctr Rx               mtspr 9,Rx


— Compiler and Assembler Note - 

For the mtspr and mfspr instructions, the SPR 
number coded in assembler language does not 
appear directly as a 10-bit binary number in the 
instruction. The number coded is split into two 
5-bit halves that are reversed in the instruction, 
with the high-order 5 bits appearing in bits 16:20 
of the instruction and the low-order 5 bits in bits 
11:15. This maintains compatibility with Power 
SPR encodings, in which these two instructions 
had only a 5-bit SPR field occupying bits 11:15. 
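
The swapping of the two halves can be sketched as follows. The field positions and the extended opcodes (467 for mtspr, 339 for mfspr) are taken from the instruction layouts in this section; the Python function names are hypothetical helpers, not assembler facilities:

```python
def spr_field(spr_number):
    """Return the 10-bit spr field (instruction bits 11:20) for an SPR number."""
    high = (spr_number >> 5) & 0x1F    # high-order 5 bits of the coded number
    low = spr_number & 0x1F            # low-order 5 bits of the coded number
    return (low << 5) | high           # low half in bits 11:15, high half in 16:20

def encode_mtspr(spr_number, rs):
    """Assemble mtspr SPR,RS: opcode 31, RS, spr field, extended opcode 467."""
    return (31 << 26) | (rs << 21) | (spr_field(spr_number) << 11) | (467 << 1)

def encode_mfspr(rt, spr_number):
    """Assemble mfspr RT,SPR: opcode 31, RT, spr field, extended opcode 339."""
    return (31 << 26) | (rt << 21) | (spr_field(spr_number) << 11) | (339 << 1)
```

For SPR numbers below 32, such as XER, LR, and CTR, the high half is zero, so the number appears unchanged in bits 11:15, in the same position as the 5-bit Power field.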


- Compatibility Note - 

For a discussion of Power compatibility with
respect to SPR numbers not shown in the instruction
descriptions for mtspr and mfspr, please refer
to Appendix G, “Incompatibilities with the Power
Architecture” on page 255. For compatibility with
future versions of this architecture, only SPR
numbers discussed in these instruction
descriptions should be used.
Move From Special Purpose Register
XFX-form


Move To Condition Register Fields
XFX-form


mfspr RT,SPR

mtcrf FXM,RS


mfspr:

Field:  31 | RT | spr | 339 | /
Bit:    0  | 6  | 11  | 21  | 31


mtcrf:

Field:  31 | RS | /  | FXM | /  | 144 | /
Bit:    0  | 6  | 11 | 12  | 20 | 21  | 31


n ← spr_{5:9} || spr_{0:4}
if length(SPREG(n)) = 64 then
    RT ← SPREG(n)
else
    RT ← ^{32}0 || SPREG(n)

The SPR field denotes a Special Purpose Register, 
encoded as shown in the table below. The contents of 
the designated Special Purpose Register are placed 
into register RT. For Special Purpose Registers that 
are 32 bits long, the low-order 32 bits of RT receive 
the contents of the Special Purpose Register and the 
high-order 32 bits of RT are set to zero. 


decimal    SPR*                      Register
           spr_{5:9} spr_{0:4}       name

1          00000 00001               XER
8          00000 01000               LR
9          00000 01001               CTR

* Note that the order of the two 5-bit
  halves of the SPR number is reversed.


Additional values of the SPR field are defined in
Part 3, “PowerPC Operating Environment
Architecture” on page 141, and others may be
defined in Book IV, PowerPC Implementation Features
for the implementation. If the SPR field contains any
value other than one of these implementation-specific
values or one of the values shown above or in Book
III, the instruction form is invalid. However, the only
effect of executing an invalid instruction form in which
spr_0 = 1 is to invoke either the system privileged
instruction error handler or the system illegal instruction
error handler.

Special Registers Altered: 

None 

Extended Mnemonics: 

Examples of extended mnemonics for Move From 
Special Purpose Register : 


mask ← ^{4}(FXM_0) || ^{4}(FXM_1) || ... ^{4}(FXM_7)
CR ← ((RS)_{32:63} & mask) | (CR & ¬mask)

The contents of bits 32:63 of register RS are placed
into the Condition Register under control of the field
mask specified by FXM. The field mask identifies the
4-bit fields affected. Let i be an integer in the range
0-7. If FXM(i) = 1 then CR field i (CR bits 4×i through
4×i+3) is set to the contents of the corresponding
field of the low-order 32 bits of RS.

Special Registers Altered: 

CR fields selected by mask 

- Programming Note -- 

Updating a proper subset of the eight fields of the 
Condition Register may have substantially poorer 
performance on some implementations than 
updating all of the fields. 
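
The expansion of the 8-bit FXM field into a 32-bit mask, as used by mtcrf, can be sketched as below. The model treats CR as a 32-bit integer whose most significant nibble is CR field 0; the function names are illustrative:

```python
def expand_fxm(fxm):
    """Replicate each FXM bit four times (FXM bit 0 is the most significant)."""
    mask = 0
    for i in range(8):
        if (fxm >> (7 - i)) & 1:               # FXM bit i selects CR field i
            mask |= 0xF << (4 * (7 - i))
    return mask

def mtcrf(cr, fxm, rs):
    """Return the new CR value after mtcrf FXM,RS."""
    mask = expand_fxm(fxm)
    return ((rs & 0xFFFFFFFF) & mask) | (cr & ~mask & 0xFFFFFFFF)
```

With FXM = 0x80 only CR field 0 is replaced, while FXM = 0xFF copies all eight fields from the low-order word of RS.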


Move to Condition Register from XER 
X-form 


mcrxr BF 


Field:  31 | BF | // | /// | /// | 512 | /
Bit:    0  | 6  | 9  | 11  | 16  | 21  | 31


CR_{4×BF:4×BF+3} ← XER_{0:3}
XER_{0:3} ← 0b0000

The contents of XER_{0:3} are copied into the Condition
Register field designated by BF. XER_{0:3} is set to zero.

Special Registers Altered:

CR XER_{0:3}


Extended:              Equivalent to:
mfxer Rx               mfspr Rx,1
mflr  Rx               mfspr Rx,8
mfctr Rx               mfspr Rx,9


- Compiler/Assembler/Compatibility Notes -

See the Notes that appear with mtspr.
Move From Condition Register X-form


mfcr RT


Field:  31 | RT | /// | /// | 19 | /
Bit:    0  | 6  | 11  | 16  | 21 | 31


RT ← ^{32}0 || CR

The contents of the Condition Register are placed into
RT_{32:63}. RT_{0:31} are set to 0.

Special Registers Altered: 

None 
Chapter 4. Floating-Point Processor


4.1 Floating-Point Processor
Overview

The Floating-Point Processor provides high performance
execution of floating-point operations.
Instructions are provided to perform arithmetic, conversion,
comparison, and other operations in floating-point
registers, and to move floating-point data
between storage and these registers. Instructions in
the first group are called “arithmetic instructions,”
and instructions in the second group are called
“storage access instructions.” Instructions are also
provided that manipulate the Floating-Point Status
and Control Register.

This architecture provides for the processor to implement
a floating-point system as defined in ANSI/IEEE
Standard 754-1985, “IEEE Standard for Binary
Floating-Point Arithmetic” (hereafter referred to as
“the IEEE standard”), but has a dependency on supporting
software to be in “conformance” with that
standard. All floating-point operations conform to that
standard, except if software sets the Floating-Point
Non-IEEE Mode (NI) bit in the Floating-Point Status
and Control Register to 1 (see page 86), in which case
floating-point operations do not necessarily conform
to that standard.

A floating-point number consists of a signed exponent
and a signed significand. The quantity expressed by
this number is the product of the significand and the
number 2^exponent. Encodings are provided in the data
format to represent finite numeric values, ±Infinity,
and values which are “Not a Number” (NaN). Operations
involving infinities produce results obeying traditional
mathematical conventions. NaNs have no
mathematical interpretation. Their encoding permits
a variable diagnostic information field. They may be
used to indicate such things as uninitialized variables
and can be produced by certain invalid operations.
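
As a small illustration of the "significand times 2^exponent" representation, the sketch below uses Python's math.frexp, which splits a double into a fraction and a power of two. Note that frexp normalizes the fraction to the range 0.5 to 1, which is an assumption of this example and differs from the normalization used by the data formats described later; the function name decompose is hypothetical:

```python
import math

def decompose(x):
    """Return (significand, exponent) with x == significand * 2**exponent."""
    significand, exponent = math.frexp(x)
    return significand, exponent

def recompose(significand, exponent):
    """Rebuild the value from its significand and exponent."""
    return math.ldexp(significand, exponent)

# Infinity minus Infinity is one of the invalid operations that yields a NaN.
invalid_result = math.inf - math.inf
```

For example, 6.0 decomposes into the significand 0.75 and the exponent 3, since 0.75 × 2^3 = 6.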

There is one class of exceptional events that occur
during instruction execution and are unique to the
Floating-Point Processor:


■ Floating-Point Exception 

Floating-point exceptions are signalled with bits set in
the Floating-Point Status and Control Register
(FPSCR). They can cause the system floating-point
enabled exception error handler to be invoked,
precisely or imprecisely, if the proper control bits are set.

Floating-Point Exceptions 


The following floating-point exceptions are detected
by the processor:

■ Invalid Operation Exception        (VX)
      SNaN                           (VXSNAN)
      Infinity - Infinity            (VXISI)
      Infinity ÷ Infinity            (VXIDI)
      Zero ÷ Zero                    (VXZDZ)
      Infinity × Zero                (VXIMZ)
      Invalid Compare                (VXVC)
      Software Request               (VXSOFT)
      Invalid Square Root            (VXSQRT)
      Invalid Integer Convert        (VXCVI)
■ Zero Divide Exception              (ZX)
■ Overflow Exception                 (OX)
■ Underflow Exception                (UX)
■ Inexact Exception                  (XX)


Each floating-point exception, and each category of
Invalid Operation Exception, has an exception bit in
the FPSCR. In addition, each floating-point exception
has a corresponding enable bit in the FPSCR. See
Section 4.2.2, “Floating-Point Status and Control
Register” on page 84, for a description of these
exception and enable bits, and Section 4.4, “Floating-Point
Exceptions” on page 90, for a detailed discussion
of floating-point exceptions, including the
effects of the enable bits.

4.2 Floating-Point Processor Registers

4.2.1 Floating-Point Registers

Implementations of this architecture provide 32
floating-point registers (FPRs). The floating-point
instruction formats provide a 5-bit field for specifying
the FPRs to be used in the execution of the instruction.
The FPRs are numbered 0-31.

Each FPR contains 64 bits which support the floating-point
double format. Every instruction that interprets
the contents of an FPR as a floating-point value uses
the floating-point double format for this interpretation.

Every floating-point arithmetic instruction operates on
data located in FPRs and, with the exception of the
Compare instructions, places the result value into an
FPR. Status information is placed into the Floating-Point
Status and Control Register and in some cases
into the Condition Register.

Load and store double instructions are provided that
transfer 64 bits of data between storage and the FPRs
in the Floating-Point Processor with no conversion.
Load single instructions are provided to transfer and
convert floating-point values in floating-point single
format from storage to the same value in floating-point
double format in the FPRs. Store single
instructions are provided to transfer and convert
floating-point values in floating-point double format
from the FPRs to the same value in floating-point
single format in storage.

Single- and double-precision arithmetic instructions
accept values from the FPRs in double format. For
single-precision arithmetic instructions, all input
values must be representable in single format: if they
are not, the result placed into the target FPR, and the
setting of status bits in the FPSCR and in the Condition
Register (if Rc=1), are undefined.

The arithmetic instructions produce intermediate
results which may be regarded as being infinitely
precise. After normalization or denormalization, if the
infinitely precise intermediate result is not representable
in the destination format (either 32-bit or 64-bit)
then it is rounded. The final result is then placed into
the FPR in the double format.



Figure 23. Floating-Point Registers (each FPR holds bits 0:63)


4.2.2 Floating-Point Status and 
Control Register 

The Floating-Point Status and Control Register
(FPSCR) controls the handling of floating-point exceptions
and records status resulting from the floating-point
operations. Bits 0:23 are status bits. Bits 24:31
are control bits.

The exception bits in the FPSCR (bits 0:12, 21:23) are
sticky, with the exception of Floating-Point Enabled
Exception Summary (FEX) and Floating-Point Invalid
Operation Exception Summary (VX). That is, once set
the sticky bits remain set until they are cleared by an
mcrfs, mtfsfi, mtfsf, or mtfsb0 instruction.

FEX and VX are simply the ORs of other FPSCR bits. 
Therefore these two bits are not listed among the 
FPSCR bits affected by the various instructions. 
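
As a minimal sketch of how the two summary bits could be derived, the Python below recomputes VX and FEX from a 32-bit FPSCR image, using the bit positions given in the FPSCR format in this section (bit 0 is the most significant bit). The helper names bit, vx, and fex are illustrative assumptions and are not architected names.

```python
# Illustrative only: helper names are assumptions, not architected names.
# FPSCR bits are numbered 0 (most significant) through 31.

def bit(fpscr, n):
    """Return FPSCR bit n, where bit 0 is the most significant of 32 bits."""
    return (fpscr >> (31 - n)) & 1

# Invalid Operation category bits: VXSNAN..VXVC (7:12), VXSOFT..VXCVI (21:23)
VX_CATEGORY_BITS = list(range(7, 13)) + [21, 22, 23]

def vx(fpscr):
    """VX (bit 2) is the OR of all Invalid Operation category bits."""
    return int(any(bit(fpscr, n) for n in VX_CATEGORY_BITS))

def fex(fpscr):
    """FEX (bit 1) is the OR of each exception bit ANDed with its enable."""
    pairs = [
        (vx(fpscr), bit(fpscr, 24)),       # VX with VE
        (bit(fpscr, 3), bit(fpscr, 25)),   # OX with OE
        (bit(fpscr, 4), bit(fpscr, 26)),   # UX with UE
        (bit(fpscr, 5), bit(fpscr, 27)),   # ZX with ZE
        (bit(fpscr, 6), bit(fpscr, 28)),   # XX with XE
    ]
    return int(any(exc and enable for exc, enable in pairs))
```

Because both values are pure functions of the other bits, a software model never needs to store them separately, which matches the statement that they cannot be set or cleared explicitly.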


Figure 24. Floating-Point Status and Control Register (FPSCR, bits 0:31)

The format of the FPSCR is: 

Bit(s) Description 

0 Floating-Point Exception Summary (FX)

Every floating-point instruction shall implicitly
set FPSCR_FX if that instruction causes any of
the floating-point exception bits in the FPSCR to
transition from 0 to 1. mcrfs shall implicitly
reset FPSCR_FX if the FPSCR field containing
FPSCR_FX is copied. mtfsf, mtfsfi, mtfsb0, and
mtfsb1 shall be able to set or clear FPSCR_FX
explicitly.

1 Floating-Point Enabled Exception Summary (FEX)

This bit signals the occurrence of any of the
enabled exception conditions. It is the OR of all
the floating-point exceptions masked with their
respective enables. mcrfs shall implicitly reset
FPSCR_FEX if the result of the logical operation
described above becomes zero. mtfsf, mtfsfi,
mtfsb0, and mtfsb1 cannot set or clear
FPSCR_FEX explicitly.

2 Floating-Point Invalid Operation Exception Summary (VX)

This bit signals the occurrence of any invalid
operation exception. It is the OR of all the
Invalid Operation exceptions. mcrfs shall
implicitly reset FPSCR_VX if the result of the
logical operation described above becomes
zero. mtfsf, mtfsfi, mtfsb0, and mtfsb1 cannot
set or clear FPSCR_VX explicitly.

3 Floating-Point Overflow Exception (OX) 

See Section 4.4.3, “Overflow Exception” on 
page 94.

4 Floating-Point Underflow Exception (UX)

See Section 4.4.4, “Underflow Exception” on 
page 94. 

5 Floating-Point Zero Divide Exception (ZX) 

See Section 4.4.2, “Zero Divide Exception” on 
page 94. 

6 Floating-Point Inexact Exception (XX) 

See Section 4.4.5, “Inexact Exception” on 
page 95. 

FPSCR_XX is a sticky version of FPSCR_FI (see
below). Thus the following rules completely
describe how FPSCR_XX is set by a given instruction.

■ If the instruction affects FPSCR_FI, the new
value of FPSCR_XX is obtained by ORing the
old value of FPSCR_XX with the new value of
FPSCR_FI.

■ If the instruction does not affect FPSCR_FI,
the value of FPSCR_XX is unchanged.

7 Floating-Point Invalid Operation Exception (SNaN) (VXSNAN)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

8 Floating-Point Invalid Operation Exception (∞−∞) (VXISI)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

9 Floating-Point Invalid Operation Exception (∞÷∞) (VXIDI)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

10 Floating-Point Invalid Operation Exception (0÷0) (VXZDZ)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

11 Floating-Point Invalid Operation Exception (∞×0) (VXIMZ)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

12 Floating-Point Invalid Operation Exception (Invalid Compare) (VXVC)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

13 Floating-Point Fraction Rounded (FR)

The last floating-point instruction that potentially
rounded the intermediate result incremented
the fraction (see Section 4.3.6,
“Rounding” on page 90). This bit is not sticky.

14 Floating-Point Fraction Inexact (FI)

The last floating-point instruction that potentially
rounded the intermediate result produced
an inexact fraction or a disabled Overflow
Exception (see Section 4.3.6, “Rounding” on
page 90). This bit is not sticky.

See the definition of FPSCR_XX, above, regarding
the relationship between FPSCR_FI and FPSCR_XX.


15:19 Floating-Point Result Flags (FPRF)

This field is set as described below. For
floating-point instructions other than the
Compare instructions, the field is set based on
the result placed into the target register, except
that if any portion of the result is undefined
then the value placed into the FPRF is undefined.

15 Floating-Point Result Class Descriptor (C)

Floating-point instructions other than the
Compare instructions may set this bit with the
FPCC bits, to indicate the class of the result as
shown in Figure 25 on page 86.

16:19 Floating-Point Condition Code (FPCC)

Floating-point Compare instructions set one of
the FPCC bits to one and the other three FPCC
bits to zero. Other floating-point instructions
may set the FPCC bits with the C bit, to indicate
the class of the result as shown in Figure 25 on
page 86. Note that in this case the high-order
three bits of the FPCC retain their relational significance
indicating that the value is less than,
greater than, or equal to zero.

16 Floating-Point Less Than or Negative (FL or <)

17 Floating-Point Greater Than or Positive (FG or >)

18 Floating-Point Equal or Zero (FE or =)

19 Floating-Point Unordered or NaN (FU or ?)

20 Reserved 

21 Floating-Point Invalid Operation Exception
(Software Request) (VXSOFT)

This bit can be altered only by mcrfs, mtfsfi,
mtfsf, mtfsb0, or mtfsb1. See Section 4.4.1,
“Invalid Operation Exception” on page 93.

22 Floating-Point Invalid Operation Exception
(Invalid Square Root) (VXSQRT)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

- Programming Note -

If the implementation does not support the
Floating Square Root instruction or the
Floating Reciprocal Square Root Estimate
instruction, software can simulate the
instruction and set this bit to reflect the
exception.

23 Floating-Point Invalid Operation Exception
(Invalid Integer Convert) (VXCVI)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

24 Floating-Point Invalid Operation Exception
Enable (VE)

See Section 4.4.1, “Invalid Operation Exception”
on page 93.

25 Floating-Point Overflow Exception Enable (OE)
See Section 4.4.3, “Overflow Exception” on 
page 94. 

26 Floating-Point Underflow Exception Enable (UE) 
See Section 4.4.4, “Underflow Exception” on 
page 94. 

27 Floating-Point Zero Divide Exception Enable 
(ZE) 

See Section 4.4.2, “Zero Divide Exception” on 
page 94. 

28 Floating-Point Inexact Exception Enable (XE) 

See Section 4.4.5, “Inexact Exception” on 
page 95. 

29 Floating-Point Non-IEEE Mode (NI)

If this bit is set to 1, the remaining FPSCR bits
may have meanings other than those given in
this document, and the results of floating-point
instructions need not conform to the IEEE
standard. If the IEEE-conforming result of a
floating-point arithmetic instruction would be a
denormalized number, the result of that instruction
is 0 (with the same sign as the denormalized
number) if FPSCR_NI = 1 and other
requirements specified in Book IV, PowerPC
Implementation Features, for the implementation
are met. The other effects of setting this
bit to 1 are described in Book IV, and may differ
between implementations.

30:31 Floating-Point Rounding Control (RN) 

See Section 4.3.6, “Rounding” on page 90. 

00 Round to Nearest
01 Round toward Zero
10 Round toward +Infinity
11 Round toward −Infinity
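
The control half of the register (bits 24:31) can be decoded with a small field extractor. The following is a sketch under the bit numbering used above (bit 0 most significant); the function names field and decode_control and the dictionary layout are assumptions made for illustration.

```python
# Illustrative decoder for FPSCR control bits 24:31; names are assumptions.

def field(fpscr, first, last):
    """Extract FPSCR bits first:last (bit 0 is the most significant of 32)."""
    width = last - first + 1
    return (fpscr >> (31 - last)) & ((1 << width) - 1)

ROUNDING_MODES = {
    0b00: "Round to Nearest",
    0b01: "Round toward Zero",
    0b10: "Round toward +Infinity",
    0b11: "Round toward -Infinity",
}

ENABLES = {"VE": 24, "OE": 25, "UE": 26, "ZE": 27, "XE": 28}

def decode_control(fpscr):
    """Return the enabled-exception names, the NI bit, and the rounding mode."""
    return {
        "enabled": [name for name, n in ENABLES.items() if field(fpscr, n, n)],
        "NI": field(fpscr, 29, 29),
        "RN": ROUNDING_MODES[field(fpscr, 30, 31)],
    }
```

For example, an FPSCR image of 0x00000093 has VE and ZE set, NI clear, and RN = 11.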


4.3 Floating-Point Data 

4.3.1 Data Format 

This architecture defines the representation of a
floating-point value in two different binary fixed-length
formats. The format may be a 32-bit single format for
a single-precision value or a 64-bit double format for
a double-precision value. The single format may be
used for data in storage. The double format
may be used for data in storage and for data in
floating-point registers.

The length of the exponent and the fraction fields 
differ between these two formats. The structure of 
the single and double formats is shown below: 


| S | EXP | FRACTION |
  0   1     9      31

Figure 26. Floating-Point Single Format

| S | EXP | FRACTION |
  0   1     12     63

Figure 27. Floating-Point Double Format

Values in floating-point format are composed of three 
fields: 

S sign bit 

EXP exponent + bias 

FRACTION fraction 


Result Flags
C  <  >  =  ?     Result Value Class

1  0  0  0  1     Quiet NaN
0  1  0  0  1     −Infinity
0  1  0  0  0     −Normalized Number
1  1  0  0  0     −Denormalized Number
1  0  0  1  0     −Zero
0  0  0  1  0     +Zero
1  0  1  0  0     +Denormalized Number
0  0  1  0  0     +Normalized Number
0  0  1  0  1     +Infinity

Figure 25. Floating-Point Result Flags
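
The mapping in Figure 25 can be exercised with a short classifier. This is a sketch only: it inspects the double-format bit pattern of a Python float, the function name fprf is an assumption, and any NaN is reported with the Quiet NaN flags because the figure lists no separate row for a Signalling NaN result.

```python
import struct

def fprf(x):
    """Return the FPRF flags "C<>=?" of Figure 25 for a double value x."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exp = (bits >> 52) & 0x7FF            # biased exponent (11 bits)
    frac = bits & ((1 << 52) - 1)         # fraction (52 bits)
    if exp == 0x7FF:                      # maximum biased exponent
        if frac != 0:
            return "10001"                # NaN, reported as Quiet NaN
        return "01001" if sign else "00101"   # -Infinity / +Infinity
    if exp == 0:
        if frac == 0:
            return "10010" if sign else "00010"   # -Zero / +Zero
        return "11000" if sign else "10100"       # denormalized
    return "01000" if sign else "00100"           # normalized
```

Reading the five characters left to right gives C followed by the four FPCC bits, so fprf(-0.0) yields C = 1 with only the "=" bit set.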


If only a portion of a floating-point data item in
storage is accessed, such as with a load or store
instruction for a byte or halfword (or word in the case
of floating-point double format), the value affected will
depend on whether the PowerPC system is operating
with Big-Endian byte order (the default), or Little-Endian
byte order. See Appendix D, “Little-Endian
Byte Ordering” on page 235.

Representation of numerical values in the floating-point
formats consists of a sign bit S, a biased exponent
EXP, and the fraction portion FRACTION of the
significand. The significand consists of a leading
implied bit concatenated on the right with the FRACTION.
This leading implied bit is a one for normalized
numbers and a zero for denormalized numbers and is
located in the unit bit position (i.e. the first bit to the
left of the binary point). Values representable within
the two floating-point formats can be specified by the
parameters listed in Figure 28 on page 87.

                        Format
                        Single    Double

Exponent Bias           +127      +1023
Maximum Exponent        +127      +1023
Minimum Exponent        −126      −1022

Widths (bits)
  Format                32        64
  Sign                  1         1
  Exponent              8         11
  Fraction              23        52
  Significand           24        53

Figure 28. IEEE Floating-Point Fields
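
The field widths in Figure 28 are enough to split a stored bit pattern into S, EXP, and FRACTION. The sketch below does this for both formats; the table FORMATS and the helper names are illustrative assumptions, and Python's struct module is used only to obtain the bit patterns of host floating-point values.

```python
import struct

# (exponent width, fraction width, exponent bias), taken from Figure 28
FORMATS = {"single": (8, 23, 127), "double": (11, 52, 1023)}

def split_fields(bits, fmt):
    """Split a 32- or 64-bit pattern into (S, EXP, FRACTION)."""
    exp_w, frac_w, _bias = FORMATS[fmt]
    s = bits >> (exp_w + frac_w)
    exp = (bits >> frac_w) & ((1 << exp_w) - 1)
    fraction = bits & ((1 << frac_w) - 1)
    return s, exp, fraction

def double_bits(x):
    """Bit pattern of x in double format."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def single_bits(x):
    """Bit pattern of x in single format (x must be in single range)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]
```

For 1.0 in double format the biased exponent is 1023, which is the bias itself, so the unbiased exponent is 0.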


The architecture requires that the FPRs of the 
Floating-Point Processor support the arithmetic 
instructions on values in the floating-point double 
format only. 

4.3.2 Value Representation 

This architecture defines numerical and non-numerical
values representable within each of the two supported
formats. The numerical values are approximations to
the real numbers and include the normalized
numbers, denormalized numbers, and zero values.
The non-numerical values representable are the infinities,
and the Not a Numbers (NaNs). The infinities
are adjoined to the real numbers, but are not
numbers themselves, and the standard rules of arithmetic
do not hold when they appear in an operation.
They are related to the reals by order alone. It is
possible however to define restricted operations
among numbers and infinities as defined below. The
relative location on the real number line for each of
the defined entities is shown in Figure 29.


-INF   -NOR   -DEN   -0 +0   +DEN   +NOR   +INF

Figure 29. Approximation to Real Numbers

The NaNs are not related to the numbers or infinities 
by order or value but are encodings used to convey 
diagnostic information such as the representation of 
uninitialized variables. 

The following is a description of the different floating-point
values defined in the architecture:

Binary floating-point numbers

Machine representable values used as approximations
to real numbers. Three categories of
numbers are supported: normalized numbers, denormalized
numbers, and zero values.


Normalized numbers (±NOR)

These are values which have a biased exponent value
in the range:

1 to 254 in single format
1 to 2046 in double format

They are values in which the implied unit bit is one.
Normalized numbers are interpreted as follows:

NOR = (−1)^s × 2^E × (1.fraction)

where (s) is the sign, (E) is the unbiased exponent and
(1.fraction) is the significand which is composed of a
leading unit bit (implied bit) and a fraction part.

The ranges covered by the magnitude (M) of a normalized
floating-point number are approximately
equal to:

Single Format:

1.2×10^−38 ≤ M ≤ 3.4×10^38

Double Format:

2.2×10^−308 ≤ M ≤ 1.8×10^308

Zero values (±0)

These are values which have a biased exponent value
of zero and a fraction value of zero. Zeros can have
a positive or negative sign. The sign of zero is
ignored by comparison operations (i.e., comparison
regards +0 as equal to −0).

Denormalized numbers (±DEN)

These are values which have a biased exponent value
of zero and a non-zero fraction value. They are non-zero
numbers smaller in magnitude than the representable
normalized numbers. They are values in
which the implied unit bit is zero. Denormalized
numbers are interpreted as follows:

DEN = (−1)^s × 2^Emin × (0.fraction)

where Emin is the minimum representable exponent
value (−126 for single-precision, −1022 for double-precision).
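
The NOR and DEN interpretations can be checked with exact rational arithmetic. The sketch below evaluates both formulas from the three fields; the function name value and its default arguments (double-format widths) are assumptions for illustration, and encodings with the maximum biased exponent are rejected because they do not denote numbers.

```python
from fractions import Fraction

def value(s, exp, fraction, frac_w=52, bias=1023):
    """Exact value of a normalized, denormalized, or zero encoding.

    s, exp, fraction are the S, EXP, and FRACTION fields; frac_w and
    bias default to the double format (use 23 and 127 for single).
    """
    max_exp = 2 * bias + 1                 # 255 single, 2047 double
    if exp == max_exp:
        raise ValueError("infinity or NaN has no numeric value")
    emin = 1 - bias                        # -126 single, -1022 double
    if exp == 0:
        significand = Fraction(fraction, 1 << frac_w)       # 0.fraction
        e = emin
    else:
        significand = 1 + Fraction(fraction, 1 << frac_w)   # 1.fraction
        e = exp - bias
    return (-1) ** s * significand * Fraction(2) ** e
```

With these definitions the smallest positive double-format denormalized number, value(0, 0, 1), is exactly 2^−1074.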

Infinities (±∞)

These are values which have the maximum biased
exponent value:

255 in the single format
2047 in the double format

and a zero fraction value. They are used to approximate
values greater in magnitude than the maximum
normalized value.

Infinity arithmetic is defined as the limiting case of
real arithmetic, with restricted operations defined
among numbers and infinities. Infinities and the reals
can be related by ordering in the affine sense:

−∞ < every finite number < +∞

Arithmetic on infinities is always exact and does not
signal any exception, except when an exception
occurs due to the invalid operations as described in
Section 4.4.1, “Invalid Operation Exception” on
page 93.

Not a Numbers (NaNs)

These are values which have the maximum biased
exponent value and a non-zero fraction value. The
sign bit is ignored (i.e. NaNs are neither positive nor
negative). If the high-order bit of the fraction field is
a zero then the NaN is a Signalling NaN, otherwise it
is a Quiet NaN.

Signalling NaNs are used to signal exceptions when
they appear as arithmetic operands.

Quiet NaNs are used to represent the results of
certain invalid operations, such as invalid arithmetic
operations on infinities or on NaNs, when Invalid
Operation Exception is disabled (FPSCR_VE = 0). Quiet
NaNs propagate through all operations except ordered
comparison, Floating Round to Single-Precision, and
conversion to integer. Quiet NaNs do not signal
exceptions, except for ordered comparison and conversion
to integer operations. Specific encodings, in
QNaNs, can thus be preserved through a sequence of
operations, and used to convey diagnostic information
to help identify results from invalid operations.

When a QNaN is the result of an operation because
one of the operands is a NaN or because a QNaN was
generated due to a disabled Invalid Operation Exception,
then the following rule is applied to determine
the NaN with the high-order fraction bit set to one that
is to be stored as the result.

if (FRA) is a NaN
    then FRT <- (FRA)
else if (FRB) is a NaN
    then if instruction is frsp
        then FRT <- (FRB)[0:34] || 29 zero bits
        else FRT <- (FRB)
else if (FRC) is a NaN
    then FRT <- (FRC)
else if generated QNaN
    then FRT <- generated QNaN

If the operand specified by FRA is a NaN, then that 
NaN is stored as the result. Otherwise, if the operand 
specified by FRB is a NaN (if the instruction specifies 
an FRB operand), then that NaN is stored as the 
result, with the low-order 29 bits of the result set to 0 
if the instruction is frsp. Otherwise, if the operand 
specified by FRC is a NaN (if the instruction specifies 
an FRC operand), then that NaN is stored as the 
result. Otherwise, if a QNaN was generated due to a 
disabled Invalid Operation Exception, then that QNaN 
is stored as the result. If a QNaN is to be generated 
as a result, then the QNaN generated has a sign bit of 
zero, an exponent field of all ones, and a high-order 
fraction bit of one with all other fraction bits zero. 
Any instruction that generates a QNaN as the result of
a disabled Invalid Operation must generate this QNaN
(i.e., 0x7FF8_0000_0000_0000).

A double-precision NaN is considered to be representable
in single format if and only if the low-order 29
bits of the double-precision NaN's fraction are zero.
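
The NaN selection rule and the single-format representability test can be sketched on 64-bit patterns as follows. The function and constant names are illustrative assumptions. Setting the high-order fraction bit of the selected operand is this sketch's reading of “the NaN with the high-order fraction bit set to one”, and an operand the instruction does not specify is passed as None.

```python
QUIET_BIT = 1 << 51                      # high-order fraction bit (double)
GENERATED_QNAN = 0x7FF8_0000_0000_0000   # QNaN generated by the processor
LOW29 = (1 << 29) - 1                    # low-order 29 fraction bits

def is_nan(bits):
    """True if the double-format pattern has EXP all ones and FRACTION != 0."""
    return ((bits >> 52) & 0x7FF) == 0x7FF and (bits & ((1 << 52) - 1)) != 0

def result_nan(fra=None, frb=None, frc=None, is_frsp=False, generated=False):
    """Return the QNaN pattern stored in FRT, or None if no NaN results."""
    if fra is not None and is_nan(fra):
        chosen = fra
    elif frb is not None and is_nan(frb):
        # frsp keeps bits 0:34 and clears the low-order 29 bits
        chosen = (frb & ~LOW29) if is_frsp else frb
    elif frc is not None and is_nan(frc):
        chosen = frc
    elif generated:
        return GENERATED_QNAN
    else:
        return None
    return chosen | QUIET_BIT            # stored NaN is always quiet

def nan_representable_in_single(bits):
    """A double NaN fits single format iff its low-order 29 bits are zero."""
    return is_nan(bits) and (bits & LOW29) == 0
```

The priority FRA, then FRB, then FRC means that a NaN payload in FRA always wins, which lets software predict which diagnostic encoding survives an operation.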

4.3.3 Sign of Result 

The following rules govern the sign of the result of an
arithmetic operation, when the operation does not
yield an exception. They apply even when the operands
or results are zeros or infinities.

■ The sign of the result of an addition operation is
the sign of the operand having the larger absolute
value. If both operands have the same sign,
the sign of the result of an addition operation is
the same as the sign of the operands. The sign
of the result of the subtraction operation x − y is
the same as the sign of the result of the addition
operation x + (−y).

When the sum of two operands with opposite
sign, or the difference of two operands with the
same sign, is exactly zero, the sign of the result
is positive in all rounding modes except Round
toward −Infinity, in which mode the sign is negative.

■ The sign of the result of a multiplication or division
operation is the Exclusive OR of the signs of
the operands.

■ The sign of the result of a Square Root or Reciprocal
Square Root Estimate operation is always
positive, except that the square root of −0 is −0
and the reciprocal square root of −0 is −Infinity.

■ The sign of the result of a Round to Single-Precision
or Convert to/from Integer operation is
the sign of the operand being converted.

For the Multiply-Add instructions, the rules given
above are applied first to the multiplication operation
and then to the addition or subtraction operation (one
of the inputs to the addition or subtraction operation
is the result of the multiplication operation).
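
Several of these sign rules can be observed directly on a host whose floating-point arithmetic follows the IEEE standard in Round to Nearest mode, which is assumed here for Python's float type. The helper names below are illustrative; exact_zero_sum_sign models the rounding-mode dependence that cannot be observed without changing the host's rounding mode.

```python
import math

def sign_bit(x):
    """Return 1 if x (including -0.0) carries a negative sign, else 0."""
    return 1 if math.copysign(1.0, x) < 0 else 0

def exact_zero_sum_sign(rn):
    """Sign of an exactly-zero sum of opposite-signed operands, given RN."""
    return 1 if rn == 0b11 else 0        # negative only for Round toward -Infinity

def product_sign(sign_a, sign_b):
    """Sign of a product or quotient: Exclusive OR of the operand signs."""
    return sign_a ^ sign_b
```

Under the stated assumption, sign_bit(1.0 + -1.0) is 0, matching exact_zero_sum_sign(0b00), and sign_bit(-3.0 * 0.0) is 1, matching product_sign(1, 0).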

4.3.4 Normalization and 
Denormalization 

When an arithmetic operation produces an intermediate
result, consisting of a sign bit, an exponent, and
a non-zero significand with a zero leading bit, it is not
a normalized number and must be normalized before
it is stored.

A number is normalized by shifting its significand left 
while decrementing its exponent by one for each bit 
shifted, until the leading significand bit becomes one. 
The guard bit and the round bit (see Section 4.5.1,
“Execution Model for IEEE Operations” on page 96)
participate in the shift with zeros shifted into the
round bit. The exponent is regarded as if its range 
were unlimited. If the resulting exponent value is less 
than the minimum value that can be represented in 
the format specified for the result, the intermediate 
result is said to be “Tiny” and the stored result is 
determined by the rules described in Section 4.4.4, 
“Underflow Exception” on page 94. The sign of the 
number does not change. 

When an arithmetic operation produces a non-zero 
intermediate result with an exponent value less than 
the minimum value that can be represented in the 
format specified for the result, the stored result is 
determined by the rules described in Section 4.4.4, 
“Underflow Exception” on page 94. This process may 
require denormalization. 

A number is denormalized by shifting its significand
right while incrementing its exponent by one for each
bit shifted, until the exponent is equal to the format's
minimum value. If any significant bits are lost in this
shifting process then “Loss of Accuracy” has occurred
(see Section 4.4.4, “Underflow Exception” on
page 94) and Underflow Exception is signalled. The
sign of the number does not change.
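
The two shifting procedures can be sketched on an integer significand. This is a simplified model and not the execution model of Section 4.5.1: the guard and round bits are omitted, the significand is an unsigned integer of a stated width, and “Loss of Accuracy” is approximated as any one bit being shifted out. The function names are assumptions.

```python
def normalize(significand, exponent, width=53):
    """Shift left until the leading bit of a width-bit significand is one.

    The exponent is treated as unbounded, as the text specifies.
    """
    if significand == 0:
        raise ValueError("a zero significand cannot be normalized")
    while (significand >> (width - 1)) == 0:
        significand <<= 1
        exponent -= 1
    return significand, exponent

def denormalize(significand, exponent, emin=-1022):
    """Shift right until the exponent reaches emin; report lost one bits."""
    loss_of_accuracy = False
    while exponent < emin:
        loss_of_accuracy = loss_of_accuracy or bool(significand & 1)
        significand >>= 1
        exponent += 1
    return significand, exponent, loss_of_accuracy
```

For a 4-bit significand, normalize(0b0011, 0, width=4) shifts twice and returns (0b1100, -2); the represented value is unchanged because each left shift is paired with an exponent decrement.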

4.3.5 Data Handling and Precision 

Instructions are defined to move floating-point data 
between the FPRs and storage. For double format 
data the data is not altered during the move. For 
single format data, a format conversion from single to 
double is performed when loading from storage into 
an FPR and a format conversion from double to single 
is performed when storing from an FPR to storage. 
No floating-point exceptions are raised during these 
operations. 

All arithmetic operations are performed using 
floating-point double format. 

Floating-point single-precision is obtained with the 
implementation of four types of instruction. 

1. Load Floating-Point Single

This form of instruction accesses a single-precision
operand in single format in storage,
converts it to double-precision, and loads it into
an FPR. No exceptions are detected on the load
operation.

2. Round to Floating-Point Single-Precision

The Floating Round to Single-Precision instruction
rounds a double-precision operand to single-precision
if the operand is not already in single-precision
range, checking the exponent for
single-precision range and handling any exceptions
according to respective enable bits, and
places that operand into an FPR as a double-precision
operand. For results produced by
single-precision arithmetic instructions and by
single-precision loads, this operation does not
alter the value.

3. Single-Precision Arithmetic Instructions

This form of instruction takes operands from the
FPRs in double format, performs the operation as
if it produced an intermediate result correct to
infinite precision and with unbounded range, and
then coerces this intermediate result to fit in
single format. Status bits, in the FPSCR and in
the Condition Register, are set to reflect the
single-precision result. The result is then converted
to double format and placed into an FPR.
The result lies in the range supported by the
single format.

All input values must be representable in single
format: if they are not, the result placed into the
target FPR, and the setting of status bits in the
FPSCR and in the Condition Register (if Rc=1),
are undefined.

4. Store Floating-Point Single

This form of instruction converts a double-precision
operand to single format and stores
that operand into storage. No exceptions are
detected on the store operation (the value being
stored is effectively assumed to be the result of
an instruction of one of the preceding three
types).

When the result of a Load Floating-Point Single,
Floating Round to Single-Precision, or single-precision
arithmetic instruction is stored in an FPR, the low-order
29 FRACTION bits are zero.
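
The 29-bit property follows from the fraction widths (52 − 23 = 29) and can be observed on a host. The sketch below approximates Floating Round to Single-Precision for in-range values by packing a Python float into single format and unpacking it again; it assumes the host conversion rounds to nearest, does not set any status bits, and the helper names are illustrative. Values outside single range raise OverflowError in struct.pack rather than modeling the Overflow Exception.

```python
import struct

def round_to_single(x):
    """Round a double to single precision and return it in double format."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def double_bits(x):
    """Bit pattern of x in double format."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def low29_are_zero(x):
    """True if the low-order 29 FRACTION bits of the double x are zero."""
    return (double_bits(x) & ((1 << 29) - 1)) == 0
```

Applying round_to_single a second time leaves the value unchanged, which mirrors the statement that the instruction does not alter results already produced by single-precision loads or arithmetic.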

- Programming Note -

The Floating Round to Single-Precision instruction
is provided to allow value conversion from
double-precision to single-precision with appropriate
exception checking and rounding. This
instruction should be used to convert double-precision
floating-point values (produced by
double-precision load and arithmetic instructions)
to single-precision values prior to storing them
into single format storage elements or using them
as operands for single-precision arithmetic
instructions. Values produced by single-precision
load and arithmetic instructions can be stored
directly, or used directly as operands for single-precision
arithmetic instructions, without preceding
the store, or the arithmetic instruction, by
a Floating Round to Single-Precision instruction.

- Programming Note -

A single-precision value can be used in double-precision
arithmetic operations. The reverse is
not necessarily true (it is true only if the double-precision
value is representable in single format).

Some implementations may execute single-precision
arithmetic instructions faster than
double-precision arithmetic instructions. Therefore,
if double-precision accuracy is not required,
single-precision data and instructions should be
used.


4.3.6 Rounding 

With the exception of the two optional Estimate
instructions, Floating Reciprocal Estimate Single and
Floating Reciprocal Square Root Estimate, all arithmetic
instructions defined by this architecture
produce an intermediate result that can be regarded
as being infinitely precise. This result must then be
written with a precision of finite length into an FPR.
After normalization or denormalization, if the infinitely
precise intermediate result is not representable in the
precision required by the instruction then it is
rounded before being placed into the target FPR.

The instructions that potentially round their result are
the Arithmetic, Multiply-Add, and Rounding and Conversion
instructions. For a given instance of one of
these instructions, whether rounding occurs depends
on the values of the inputs. Each of these instructions
sets FPSCR bits FR and FI, according to whether
rounding occurred (FI) and whether the fraction was
incremented (FR). If rounding occurred, FI is set to
one, and FR may be set to either zero or one. If
rounding did not occur, both FR and FI are set to
zero.

The two Estimate instructions set FR and FI to unde¬ 
fined values. The remaining Floating-Point 
instructions do not alter FR and FI. 

Four modes of rounding are provided which are user-selectable
through the Floating-Point Rounding
Control field in the FPSCR. See Section 4.2.2,
“Floating-Point Status and Control Register” on
page 84. These are encoded as follows:

RN Rounding Mode
00 Round to Nearest
01 Round toward Zero
10 Round toward +Infinity
11 Round toward −Infinity

Let Z be the infinitely precise intermediate arithmetic
result or the operand of a convert operation. If Z can
be represented exactly in the target format, then no
rounding occurs, and the result in all rounding modes
is equivalent to truncation of Z. If Z cannot be
represented exactly in the target format, let Z1 and
Z2 be the next larger and next smaller numbers
representable in the target format that bound Z; then
Z1 or Z2 can be used to approximate the result in the
target format.

Figure 30 shows the relation of Z, Z1, and Z2 in this 
case. The following rules specify the rounding in the 
four modes. “LSB” means “least significant bit.”

Figure 30. Selection of Z1 and Z2

Round to Nearest

Choose the best approximation (Z1 or Z2). In
case of a tie, choose the one which is even (least
significant bit 0).

Round toward Zero

Choose the smaller in magnitude (Z1 or Z2).

Round toward +Infinity

Choose Z1.

Round toward −Infinity

Choose Z2.

See Section 4.5.1, “Execution Model for IEEE 
Operations” on page 96 for a detailed explanation of 
rounding. 
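
The choice between Z1 and Z2 in the four modes, together with the FR and FI settings described earlier in this section, can be sketched with exact rational arithmetic. This is an illustration under simplifying assumptions: the representable numbers near Z are modeled as integer multiples of a spacing called quantum (a hypothetical parameter), overflow and underflow are not modeled, and the function name round_z is an assumption.

```python
from fractions import Fraction
import math

def round_z(z, quantum, rn):
    """Round exact value z to a multiple of quantum under rounding mode rn.

    Returns (result, FR, FI): FI is 1 if rounding occurred, FR is 1 if
    the magnitude was incremented.
    """
    z, quantum = Fraction(z), Fraction(quantum)
    steps = z / quantum
    lo, hi = math.floor(steps), math.ceil(steps)   # Z2 and Z1, in quanta
    if lo == hi:                                   # Z is exactly representable
        return z, 0, 0
    if rn == 0b00:                                 # Round to Nearest
        if steps - lo != hi - steps:
            pick = lo if steps - lo < hi - steps else hi
        else:
            pick = lo if lo % 2 == 0 else hi       # tie: even LSB
    elif rn == 0b01:                               # Round toward Zero
        pick = lo if z > 0 else hi
    elif rn == 0b10:                               # Round toward +Infinity
        pick = hi                                  # Z1
    else:                                          # Round toward -Infinity
        pick = lo                                  # Z2
    fr = int(abs(pick) > abs(steps))               # magnitude incremented
    return pick * quantum, fr, 1
```

With a quantum of 1, the value 5/2 rounds to 2 in Round to Nearest (tie, even choice) with FR = 0 and FI = 1, while 7/2 rounds to 4 with FR = 1.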

If Z is to be rounded up and Z1 does not exist (i.e., if
there is no number larger than Z that is representable
in the target format), then an Overflow Exception
occurs if Z is positive and an Underflow Exception
occurs if Z is negative. Similarly, if Z is to be
rounded down and Z2 does not exist, then an Overflow
Exception occurs if Z is negative and an Underflow
Exception occurs if Z is positive. The results in
these cases are defined in Section 4.4, “Floating-Point
Exceptions” on page 90.


4.4 Floating-Point Exceptions 

This architecture defines the following floating-point 
exceptions: 

■ Invalid Operation Exception
    SNaN
    Infinity−Infinity
    Infinity÷Infinity
    Zero÷Zero
    Infinity×Zero
    Invalid Compare
    Software Request
    Invalid Square Root
    Invalid Integer Convert

■ Zero Divide Exception

■ Overflow Exception

■ Underflow Exception

■ Inexact Exception

These exceptions may occur during execution of
floating-point arithmetic instructions. In addition, an
Invalid Operation Exception occurs when a Status and
Control Register instruction sets FPSCR_VXSOFT to 1
(Software Request). An Invalid Square Root operation
can occur only if at least one of the Floating
Square Root instructions defined in Appendix A,
“Optional Instructions” on page 209, is implemented.

Each floating-point exception, and each category of
Invalid Operation Exception, has an exception bit in
the FPSCR. In addition, each floating-point exception
has a corresponding enable bit in the FPSCR. The
exception bit indicates occurrence of the corresponding
exception. If an exception occurs, the corresponding
enable bit governs the result produced by
the instruction and, in conjunction with the FE0 and
FE1 bits (see page 92), whether and how the system
floating-point enabled exception error handler is
invoked. (In general, the enabling specified by the
enable bit is of invoking the system error handler, not
of permitting the exception to occur. The occurrence
of an exception depends only on the instruction and
its inputs, not on the setting of any control bits. The
only deviation from this general rule is that the occurrence
of an Underflow Exception may depend on the
setting of the enable bit.)

The Floating-Point Exception Summary bit (FX) in the 
FPSCR is set when any of the exception bits transi¬ 
tions from a zero to a one or when explicitly set by 
software. The Floating-Point Enabled Exception 
Summary bit (FEX) in the FPSCR is set when any of 
the exceptions is set and the exception is enabled 
(enable bit is one). 
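
The rule for the two summary bits can be sketched as follows. This is a minimal illustration, not the architected definition: it models the FPSCR as a dictionary of named bits, works at the level of the five exception bits (treating the Invalid Operation categories as one summary bit named VX), and does not model software setting FX directly. The dictionary representation, the VX and XE names, and the function name are assumptions made for the example.

```python
# Hypothetical model: FPSCR as a dict of bit name -> 0/1.
# The exception/enable pairing follows the text; the data structure is assumed.
EXCEPTIONS = ("VX", "OX", "UX", "ZX", "XX")
ENABLES = {"VX": "VE", "OX": "OE", "UX": "UE", "ZX": "ZE", "XX": "XE"}


def update_summary(old, new):
    """Return (FX, FEX) after the exception bits change from `old` to `new`.

    FX is sticky: it stays set once set, and becomes 1 when any exception
    bit makes a 0 -> 1 transition. FEX is 1 when any exception bit is set
    and its enable bit is also set.
    """
    fx = old.get("FX", 0)
    if any(old.get(e, 0) == 0 and new.get(e, 0) == 1 for e in EXCEPTIONS):
        fx = 1
    fex = int(any(new.get(e, 0) and new.get(ENABLES[e], 0) for e in EXCEPTIONS))
    return fx, fex
```

Note that an exception bit that was already 1 does not set FX again, which is why FX and FEX can differ in both directions.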

A single instruction, other than mtfsfi or mtfsf, may 
set more than one exception in the following cases: 

■ Inexact Exception may be set with Overflow 
Exception. 

■ Inexact Exception may be set with Underflow 
Exception. 

■ Invalid Operation Exception (SNaN) may be set
with Invalid Operation Exception (∞×0) for
Multiply-Add instructions.

■ Invalid Operation Exception (SNaN) may be set 
with Invalid Operation Exception (Invalid 
Compare) for Compare Ordered instructions. 

■ Invalid Operation Exception (SNaN) may be set 
with Invalid Operation Exception (Invalid Integer 
Convert) for Convert to Integer instructions. 

When an exception occurs the instruction execution 
may be suppressed or a result may be delivered, 
depending on the exception. 


Instruction execution is suppressed for the following 
kinds of exception, so that there is no possibility that 
one of the operands is lost. 

■ Enabled Invalid Operation 

■ Enabled Zero Divide 

For the remaining kinds of exception, a result is
generated and written to the destination specified by the
instruction causing the exception. The result may be
a different value for the enabled and disabled
conditions for some of these exceptions. The kinds of
exception that deliver a result are the following.

■ Disabled Invalid Operation 

■ Disabled Zero Divide 

■ Disabled Overflow 

■ Disabled Underflow 

■ Disabled Inexact 

■ Enabled Overflow 

■ Enabled Underflow 

■ Enabled Inexact 

Subsequent sections define each of the floating-point 
exceptions and specify the action that is taken when 
they are detected. 

The IEEE standard specifies the handling of exceptional
conditions in terms of “traps” and “trap handlers.”
In this architecture, an FPSCR exception
enable bit of 1 causes generation of the result value
specified in the IEEE standard for the “trap enabled”
case: the expectation is that the exception will be
detected by software, which will revise the result. An
FPSCR exception enable bit of 0 causes generation of
the “default result” value specified for the “trap
disabled” (or “no trap occurs” or “trap is not
implemented”) case: the expectation is that the exception
will not be detected by software, which will simply use
the default result. The result to be delivered in each
case for each exception is described in the sections
below.

The IEEE default behavior when an exception occurs
is to generate a default value and not to notify
software. In this architecture, if the IEEE default behavior
when an exception occurs is desired for all exceptions,
all FPSCR exception enable bits should be set
to 0 and Ignore Exceptions Mode (see below) should
be used. In this case the system floating-point
enabled exception error handler is not invoked, even
if floating-point exceptions occur: software can inspect
the FPSCR exception bits if necessary, to determine
whether exceptions have occurred.

In this architecture, if software is to be notified that a
given kind of exception has occurred, the corresponding
FPSCR exception enable bit must be set to 1
and a mode other than Ignore Exceptions Mode must
be used. In this case the system floating-point
enabled exception error handler is invoked if an
enabled floating-point exception occurs.

Whether and how the system floating-point enabled
exception error handler is invoked if an enabled
floating-point exception occurs is controlled by the
FE0 and FE1 bits. The location of these bits and the
requirements for altering them are described in
Part 3, “PowerPC Operating Environment
Architecture” on page 141. (The system floating-point
enabled exception error handler is never
invoked because of a disabled floating-point exception.)
The effects of the four possible settings of
these bits are as follows.

FE0 FE1 Description

0 0 Ignore Exceptions Mode 

Floating-point exceptions do not cause the 
system floating-point enabled exception 
error handler to be invoked. 

0 1 Imprecise Nonrecoverable Mode 

The system floating-point enabled exception 
error handler is invoked at some point at or 
beyond the instruction that caused the 
enabled exception. It may not be possible to 
identify the excepting instruction nor the 
data that caused the exception. Results 
produced by the excepting instruction may 
have been used by or may have affected 
subsequent instructions that are executed 
before the error handler is invoked. 

1 0 Imprecise Recoverable Mode 

The system floating-point enabled exception 
error handler is invoked at some point at or 
beyond the instruction that caused the 
enabled exception. Sufficient information is 
provided to the error handler that it can 
identify the excepting instruction and the 
operands, and correct the result. No results 
produced by the excepting instruction have 
been used by or have affected subsequent 
instructions that are executed before the 
error handler is invoked. 

1 1 Precise Mode 

The system floating-point enabled exception 
error handler is invoked precisely at the 
instruction that caused the enabled excep¬ 
tion. 

In all cases the question of whether or not a floating-point
result is stored, and what value is stored, is
governed by the FPSCR exception enable bits, as
described in subsequent sections, and is not affected
by the value of the FE0 and FE1 bits.
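
The four settings can be summarized in a small lookup. This is a sketch for illustration only; the table and function names are hypothetical, and the `fex` argument stands for the Floating-Point Enabled Exception Summary bit described earlier.

```python
# Mode names as given in the table above, keyed by (FE0, FE1).
MODES = {
    (0, 0): "Ignore Exceptions Mode",
    (0, 1): "Imprecise Nonrecoverable Mode",
    (1, 0): "Imprecise Recoverable Mode",
    (1, 1): "Precise Mode",
}


def handler_may_be_invoked(fe0, fe1, fex):
    """True if the enabled exception error handler can be invoked.

    The handler is only ever invoked for an enabled exception (fex == 1),
    and never in Ignore Exceptions Mode (FE0 = FE1 = 0).
    """
    return fex == 1 and (fe0, fe1) != (0, 0)
```

The lookup says nothing about the result that is stored; as the preceding paragraph states, that depends only on the FPSCR exception enable bits.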

In all cases in which the system floating-point enabled
exception error handler is invoked, all instructions
before the instruction at which the system floating-point
enabled exception error handler is invoked have
completed, and no instruction after the instruction at
which the system floating-point enabled exception
error handler is invoked has been executed. (Recall
that, for the two Imprecise modes, the instruction at
which the system floating-point enabled exception
error handler is invoked need not be the instruction
that caused the exception.) The instruction at which
the system floating-point enabled exception error
handler is invoked has not been executed, unless it is
the excepting instruction, in which case it has been
executed unless the kind of exception is among those
listed above as suppressed.


In order to obtain the best performance across the 
widest range of implementations, the programmer 
should obey the following guidelines. 

■ If the IEEE default results are acceptable to the
application, Ignore Exceptions Mode should be
used, with all FPSCR exception enable bits set to
0.

■ If the IEEE default results are not acceptable to
the application, Imprecise Nonrecoverable Mode
should be used, or Imprecise Recoverable Mode
if recoverability is needed, with FPSCR exception
enable bits set to 1 for those exceptions for which
the system floating-point enabled exception error
handler is to be invoked.

■ Ignore Exceptions Mode should not, in general, be 
used when any FPSCR exception enable bits are 
set to 1. 

■ Precise Mode may degrade performance in some 
implementations, perhaps substantially, and 
therefore should be used only for debugging and 
other specialized applications. 


— Programming Note - 

In any of the three non-Precise modes, a Floating- 
Point Status and Control Register instruction can 
be used to force any exceptions, due to 
instructions initiated before the Floating-Point 
Status and Control Register instruction, to be 
recorded in the FPSCR. (This forcing is super¬ 
fluous for Precise Mode.) 

In either of the Imprecise modes, a Floating-Point
Status and Control Register instruction can be
used to force any invocations of the system
floating-point enabled exception error handler,
due to instructions initiated before the Floating-Point
Status and Control Register instruction, to
occur. (This forcing has no effect in Ignore Exceptions
Mode, and is superfluous for Precise Mode.)

A sync instruction, or any other execution
synchronizing instruction or event (e.g., isync: see
Part 2, “PowerPC Virtual Environment
Architecture” on page 117), also has the effects
described above. However, in order to obtain the
best performance across the widest range of
implementations, a Floating-Point Status and
Control Register instruction should be used to
obtain these effects.

4.4.1 Invalid Operation Exception


4.4.1.1 Definition 

An Invalid Operation Exception occurs whenever an 
operand is invalid for the specified operation. The 
invalid operations are: 

■ Any operation, except Load, Store, Move, Select, 
and mtfsf, on a signalling NaN (SNaN) 

■ For add or subtract operations, magnitude subtraction
of infinities (∞−∞)

■ Division of infinity by infinity (∞÷∞)

■ Division of zero by zero (0÷0)

■ Multiplication of infinity by zero (∞×0)

■ Ordered comparison involving a NaN (Invalid 
Compare) 

■ Square root or reciprocal square root of a nega¬ 
tive (and non-zero) number (Invalid Square Root) 

■ Integer convert involving a large number, an 
infinity, or a NaN (Invalid Integer Convert) 

In addition, an Invalid Operation Exception occurs if
software explicitly requests this by executing a mtfsfi,
mtfsf, or mtfsb1 instruction that sets FPSCR[VXSOFT] to
1 (Software Request). An Invalid Square Root operation
can occur only if at least one of the Floating
Square Root instructions defined in Appendix A,
“Optional Instructions” on page 209, is implemented.

- Programming Note -

The purpose of FPSCR[VXSOFT] is to allow software
to cause an Invalid Operation Exception for a condition
that is not necessarily associated with the
execution of a floating-point instruction. For
example, it might be set by a program that computes
a square root, if the source operand is negative.


4.4.1.2 Action 

The action to be taken depends on the setting of the 
Invalid Operation Exception Enable bit of the FPSCR. 

When Invalid Operation Exception is enabled
(FPSCR[VE] = 1) and Invalid Operation occurs or software
explicitly requests the exception then the following
actions are taken:

1. One or two Invalid Operation Exceptions are set

   FPSCR[VXSNAN]   (if SNaN)
   FPSCR[VXISI]    (if ∞−∞)
   FPSCR[VXIDI]    (if ∞÷∞)
   FPSCR[VXZDZ]    (if 0÷0)
   FPSCR[VXIMZ]    (if ∞×0)
   FPSCR[VXVC]     (if invalid comp)
   FPSCR[VXSOFT]   (if software req)
   FPSCR[VXSQRT]   (if invalid sqrt)
   FPSCR[VXCVI]    (if invalid int cvrt)

2. If the operation is an arithmetic, Floating Round
to Single-Precision, or convert to integer operation,

   the target FPR is unchanged
   FPSCR[FR FI] are set to zero
   FPSCR[FPRF] is unchanged

3. If the operation is a compare,

   FPSCR[FR FI C] are unchanged
   FPSCR[FPCC] is set to reflect unordered

4. If software explicitly requests the exception,

   FPSCR[FR FI FPRF] are as set by the mtfsfi,
   mtfsf, or mtfsb1 instruction

When Invalid Operation Exception is disabled
(FPSCR[VE] = 0) and Invalid Operation occurs or software
explicitly requests the exception then the following
actions are taken:

1. One or two Invalid Operation Exceptions are set

   FPSCR[VXSNAN]   (if SNaN)
   FPSCR[VXISI]    (if ∞−∞)
   FPSCR[VXIDI]    (if ∞÷∞)
   FPSCR[VXZDZ]    (if 0÷0)
   FPSCR[VXIMZ]    (if ∞×0)
   FPSCR[VXVC]     (if invalid comp)
   FPSCR[VXSOFT]   (if software req)
   FPSCR[VXSQRT]   (if invalid sqrt)
   FPSCR[VXCVI]    (if invalid int cvrt)

2. If the operation is an arithmetic or Floating
Round to Single-Precision operation,

   the target FPR is set to a Quiet NaN
   FPSCR[FR FI] are set to zero
   FPSCR[FPRF] is set to indicate the class of the
   result (Quiet NaN)

3. If the operation is a convert to 32-bit integer
operation,

   the target FPR is set as follows:

      FRT[0:31] <- undefined
      FRT[32:63] are set to the most positive
      32-bit integer if the operand in FRB is a
      positive number or +∞, and to the most
      negative 32-bit integer if the operand in
      FRB is a negative number, −∞, or NaN

   FPSCR[FR FI] are set to zero
   FPSCR[FPRF] is undefined

4. If the operation is a convert to 64-bit integer
operation,

   the target FPR is set as follows:

      FRT is set to the most positive 64-bit
      integer if the operand in FRB is a positive
      number or +∞, and to the most
      negative 64-bit integer if the operand in
      FRB is a negative number, −∞, or NaN

   FPSCR[FR FI] are set to zero
   FPSCR[FPRF] is undefined

5. If the operation is a compare,

   FPSCR[FR FI C] are unchanged
   FPSCR[FPCC] is set to reflect unordered

6. If software explicitly requests the exception,

   FPSCR[FR FI FPRF] are as set by the mtfsfi,
   mtfsf, or mtfsb1 instruction
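
The disabled-case result for a convert to 32-bit integer operation can be sketched as below. This is an illustration under stated assumptions, not the architected algorithm: the function names are hypothetical, only round toward zero is modeled, and "a large number" is taken to mean a value whose truncated integer does not fit in 32 bits.

```python
import math

INT32_MAX = 0x7FFFFFFF    # most positive 32-bit integer
INT32_MIN = -0x80000000   # most negative 32-bit integer


def invalid_convert_default_32(operand):
    """Default result when Invalid Integer Convert occurs and is disabled.

    Positive numbers and +infinity give the most positive integer;
    negative numbers, -infinity, and NaN give the most negative integer.
    """
    if math.isnan(operand) or operand < 0:
        return INT32_MIN
    return INT32_MAX


def convert_word_toward_zero(operand):
    """Return (integer result, vxcvi flag) for a truncating conversion."""
    if math.isnan(operand) or math.isinf(operand):
        return invalid_convert_default_32(operand), True
    truncated = math.trunc(operand)
    if truncated > INT32_MAX or truncated < INT32_MIN:
        return invalid_convert_default_32(operand), True
    return truncated, False
```

For example, `convert_word_toward_zero(3e10)` saturates to the most positive integer and reports the invalid condition, while `convert_word_toward_zero(-2.7)` simply truncates.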






4.4.2 Zero Divide Exception 

4.4.2.1 Definition 

A Zero Divide Exception occurs when a Divide instruction
is executed with a zero divisor value and a finite
non-zero dividend value. It also occurs when a Reciprocal
Estimate instruction (fres or frsqrte) is executed
with an operand value of zero.

4.4.2.2 Action 

The action to be taken depends on the setting of the 
Zero Divide Exception Enable bit of the FPSCR. 

When Zero Divide Exception is enabled (FPSCR[ZE] = 1)
and Zero Divide occurs then the following actions are
taken:

1. Zero Divide Exception is set

   FPSCR[ZX] <- 1

2. The target FPR is unchanged

3. FPSCR[FR FI] are set to zero

4. FPSCR[FPRF] is unchanged

When Zero Divide Exception is disabled (FPSCR[ZE] = 0)
and Zero Divide occurs then the following actions are
taken:

1. Zero Divide Exception is set

   FPSCR[ZX] <- 1

2. The target FPR is set to a ±Infinity, where the
sign is determined by the XOR of the signs of the
operands

3. FPSCR[FR FI] are set to zero

4. FPSCR[FPRF] is set to indicate the class and sign of
the result (±Infinity)
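
For a Divide instruction, the disabled-case result can be sketched as follows. This is a minimal illustration; the function name is hypothetical and the Reciprocal Estimate instructions, which have a single operand, are not modeled.

```python
import math


def zero_divide_default(dividend, divisor):
    """Result delivered when Zero Divide occurs and is disabled.

    The result is an infinity whose sign is the XOR of the operand signs.
    Raises ValueError if the operands do not meet the Zero Divide definition
    (zero divisor, finite non-zero dividend).
    """
    if divisor != 0.0 or dividend == 0.0 or not math.isfinite(dividend):
        raise ValueError("operands do not cause a Zero Divide Exception")
    dividend_negative = math.copysign(1.0, dividend) < 0
    divisor_negative = math.copysign(1.0, divisor) < 0   # distinguishes -0.0
    return -math.inf if dividend_negative != divisor_negative else math.inf
```

`math.copysign` is used because a comparison such as `divisor < 0` cannot tell −0.0 from +0.0.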

4.4.3 Overflow Exception 

4.4.3.1 Definition 

Overflow occurs when the magnitude of what would 
have been the rounded result if the exponent range 
were unbounded exceeds that of the largest finite 
number of the specified result precision. 

4.4.3.2 Action 

The action to be taken depends on the setting of the 
Overflow Exception Enable bit of the FPSCR. 

When Overflow Exception is enabled (FPSCR[OE] = 1)
and exponent overflow occurs then the following
actions are taken:

1. Overflow Exception is set

   FPSCR[OX] <- 1

2. For double-precision arithmetic instructions, the
exponent of the normalized intermediate result is
adjusted by subtracting 1536

3. For single-precision arithmetic instructions and
the Floating Round to Single-Precision instruction,
the exponent of the normalized intermediate
result is adjusted by subtracting 192

4. The adjusted rounded result is placed into the
target FPR

5. FPSCR[FPRF] is set to indicate the class and sign of
the result (±Normal Number)

When Overflow Exception is disabled (FPSCR[OE] = 0)
and overflow occurs then the following actions are
taken:

1. Overflow Exception is set

   FPSCR[OX] <- 1

2. Inexact Exception is set

   FPSCR[XX] <- 1

3. The result is determined by the rounding mode
(FPSCR[RN]) and the sign of the intermediate result
as follows:

   A. Round to Nearest

      Store ±Infinity, where the sign is the sign of
      the intermediate result

   B. Round toward Zero

      Store the format's largest finite number with
      the sign of the intermediate result

   C. Round toward +Infinity

      For negative overflow, store the format's
      most negative finite number; for positive
      overflow, store +Infinity

   D. Round toward −Infinity

      For negative overflow, store −Infinity; for
      positive overflow, store the format's largest
      finite number

4. The result is placed into the target FPR

5. FPSCR[FR] is undefined

6. FPSCR[FI] is set to one

7. FPSCR[FPRF] is set to indicate the class and sign of
the result (±Infinity or ±Normal Number)
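
The choice of default result in the disabled case can be written as a small function of the rounding mode and the sign of the intermediate result. This sketch covers double format only; the function name and the strings used to name the rounding modes are assumptions for the example, not FPSCR[RN] encodings.

```python
import math
import sys

DBL_MAX = sys.float_info.max  # largest finite double-format number


def overflow_default(negative, rn):
    """Default double-format result for a disabled Overflow Exception.

    negative: True if the intermediate result is negative.
    rn: "nearest", "zero", "+inf", or "-inf" (hypothetical mode names).
    """
    if rn == "nearest":
        return -math.inf if negative else math.inf
    if rn == "zero":
        return -DBL_MAX if negative else DBL_MAX
    if rn == "+inf":
        return -DBL_MAX if negative else math.inf
    if rn == "-inf":
        return -math.inf if negative else DBL_MAX
    raise ValueError("unknown rounding mode: %r" % (rn,))
```

In every mode the result is either an infinity or the largest finite magnitude, which matches the two classes listed for FPSCR[FPRF] in item 7.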

4.4.4 Underflow Exception 

4.4.4.1 Definition 

Underflow Exception is defined separately for the 
enabled and disabled states: 

■ Enabled: 

Underflow occurs when the intermediate result is 
“Tiny.” 

■ Disabled: 

Underflow occurs when the intermediate result is 
“Tiny” and there is “Loss of Accuracy.” 

A “Tiny” result is detected before rounding, when a
non-zero result value computed as though the exponent
range were unbounded would be less in magnitude
than the smallest normalized number.

If the intermediate result is “Tiny” and the Underflow
Exception Enable is off (FPSCR[UE] = 0) then the
intermediate result is denormalized (Section 4.3.4,
“Normalization and Denormalization” on page 88) and
rounded (Section 4.3.6, “Rounding” on page 90)
before being placed into the target FPR.

“Loss of Accuracy” is detected when the delivered 
result value differs from what would have been com¬ 
puted were both the exponent range and precision 
unbounded. 
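
The two definitions can be checked with exact rational arithmetic. The sketch below is an illustration only: it assumes double format and Round to Nearest, represents the infinitely precise result as a `Fraction`, and uses hypothetical function names.

```python
from fractions import Fraction

# Smallest normalized double-format magnitude, 2**-1022, held exactly.
MIN_NORMAL = Fraction(1, 2**1022)


def is_tiny(exact):
    """Tiny: non-zero and smaller in magnitude than the smallest normal."""
    return exact != 0 and abs(exact) < MIN_NORMAL


def loses_accuracy(exact):
    """Loss of Accuracy: the delivered (denormalized, rounded) value differs
    from the exact value. float() rounds to nearest, so other rounding
    modes are not modeled here."""
    delivered = float(exact)
    return Fraction(delivered) != exact


def underflow_occurs(exact, ue):
    """Apply the enabled (ue = 1) or disabled (ue = 0) definition."""
    if ue:
        return is_tiny(exact)
    return is_tiny(exact) and loses_accuracy(exact)
```

A value such as 2^-1074 is Tiny but exactly representable as a denormalized number, so it underflows only when the exception is enabled.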

4.4.4.2 Action

The action to be taken depends on the setting of the
Underflow Exception Enable bit of the FPSCR.

When Underflow Exception is enabled (FPSCR[UE] = 1)
and exponent underflow occurs then the following
actions are taken:

1. Underflow Exception is set

   FPSCR[UX] <- 1

2. For double-precision arithmetic and conversion
instructions, the exponent of the normalized
intermediate result is adjusted by adding 1536

3. For single-precision arithmetic instructions and
the Floating Round to Single-Precision instruction,
the exponent of the normalized intermediate
result is adjusted by adding 192

4. The adjusted rounded result is placed into the
target FPR

5. FPSCR[FPRF] is set to indicate the class and sign of
the result (±Normalized Number)

- Programming Note -

The FR and FI bits are provided to allow the
system floating-point enabled exception error
handler, when invoked because of an Underflow
Exception, to simulate a “trap disabled” environment.
That is, the FR and FI bits allow the system
floating-point enabled exception error handler to
unround the result, thus allowing the result to be
denormalized.


When Underflow Exception is disabled (FPSCR[UE] = 0)
and underflow occurs then the following actions are
taken:

1. Underflow Exception is set

   FPSCR[UX] <- 1

2. The rounded result is placed into the target FPR

3. FPSCR[FPRF] is set to indicate the class and sign of
the result (±Denormalized Number or ±Zero)


4.4.5 Inexact Exception

4.4.5.1 Definition

Inexact Exception occurs when one of two conditions
occurs during rounding:

1. The rounded result differs from the intermediate
result assuming the intermediate result exponent
range and precision to be unbounded.

2. The rounded result overflows and Overflow
Exception is disabled.


4.4.5.2 Action 

The action to be taken does not depend on the setting 
of the Inexact Exception Enable bit of the FPSCR. 

When Inexact Exception occurs then the following 
actions are taken: 

1. Inexact Exception is set

   FPSCR[XX] <- 1

2. The rounded or overflowed result is placed into
the target FPR

3. FPSCR[FPRF] is set to indicate the class and sign of
the result

- Programming Note - 

In some implementations, enabling Inexact Excep¬ 
tions may degrade performance more than ena¬ 
bling other types of floating-point exception. 


4.5 Floating-Point Execution 
Models 

All implementations of this architecture must provide
the equivalent of the following execution models to
ensure that identical results are obtained.

Special rules are provided in the definition of the 
arithmetic instructions for the infinities, denormalized 
numbers and NaNs. 

Although the double format specifies an 11-bit exponent,
exponent arithmetic makes use of two additional
bit positions to avoid potential transient overflow
conditions. One extra bit is required when denormalized
double-precision numbers are prenormalized. The
second bit is required to permit the computation of
the adjusted exponent value in the following cases
when the corresponding exception enable bit is one:

■ Underflow during multiplication using a denormalized
factor.

■ Overflow during division using a denormalized 
divisor. 

The IEEE standard includes 32-bit and 64-bit arithmetic.
The standard requires that single-precision
arithmetic be provided for single-precision operands.
The standard permits double-precision arithmetic
instructions to have either (or both) single-precision
or double-precision operands, but states that
single-precision arithmetic instructions should not accept
double-precision operands. The PowerPC Architecture
follows these guidelines: double-precision arithmetic
instructions can have operands of either or both
precisions, while single-precision arithmetic instructions
require all operands to be single-precision.
Double-precision arithmetic instructions produce
double-precision values, while single-precision arithmetic
instructions produce single-precision values.

For arithmetic instructions, conversions from
double-precision to single-precision must be done explicitly
by software, while conversions from single-precision
to double-precision are done implicitly.

4.5.1 Execution Model for IEEE 
Operations 

The following description uses 64-bit arithmetic as an 
example. 32-bit arithmetic is similar except that the 
FRACTION is a 23-bit field, and the single-precision 
Guard, Round, and Sticky bits (described in this 
section) are logically adjacent to the 23-bit FRACTION 
field. 

IEEE-conforming significand arithmetic is considered
to be performed with a floating-point accumulator
having the following format:

| S | C | L | FRACTION | G | R | X |
          0   1     52

Figure 31. IEEE 64-bit Execution Model

The S bit is the sign bit.

The C bit is the carry bit that captures the carry out of 
the significand. 

The L bit is the leading unit bit of the significand 
which receives the implicit bit from the operands. 

The FRACTION is a 52-bit field which accepts the frac¬ 
tion of the operands. 

The Guard (G), Round (R), and Sticky (X) bits are 
extensions to the low order bits of the accumulator. 
The G and R bits are required for post normalization 
of the result. The G, R, and X bits are required during 
rounding to determine if the intermediate result is 
equally near the two nearest representable values. 
The X bit serves as an extension to the G and R bits 
by representing the logical OR of all bits which may 
appear to the low-order side of the R bit, either due to 
shifting the accumulator right or other generation of 


low-order result bits. The G and R bits participate in 
the left shifts with zeros being shifted into the R bit. 
Figure 32 shows the significance of the G, R, and X 
bits with respect to the intermediate result (IR), the 
next lower in magnitude representable number (NL), 
and the next higher in magnitude representable 
number (NH). 


G  R  X   Interpretation

0  0  0   IR is exact

0  0  1
0  1  0   IR closer to NL
0  1  1

1  0  0   IR midway between NL & NH

1  0  1
1  1  0   IR closer to NH
1  1  1

Figure 32. Interpretation of G, R, and X bits
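
The table in Figure 32 reduces to three tests, sketched below. The function name is hypothetical; the returned strings are the interpretations given in the figure.

```python
def interpret_grx(g, r, x):
    """Classify the intermediate result (IR) from its G, R, and X bits.

    NL and NH are the next lower and next higher representable numbers
    in magnitude, as in Figure 32.
    """
    if (g, r, x) == (0, 0, 0):
        return "IR is exact"
    if g == 0:
        return "IR closer to NL"           # GRX = 001, 010, 011
    if (r, x) == (0, 0):
        return "IR midway between NL & NH"  # GRX = 100
    return "IR closer to NH"               # GRX = 101, 110, 111
```

The Guard bit alone decides which half of the interval the IR lies in; the Round and Sticky bits only distinguish the exact and midway cases from their neighbors.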


The significand of the intermediate result is made up
of the L bit, the FRACTION, and the G, R, and X bits.

The infinitely precise intermediate result of an
operation is the result normalized in bits L, FRACTION, G,
R, and X of the floating-point accumulator.

Before the results are stored into an FPR, the
significand is rounded if necessary, using the
rounding mode specified by FPSCR[RN]. If rounding
results in a carry into C, the significand is shifted right
one position and the exponent incremented by one.
This yields an inexact result and possibly also
exponent overflow. Fraction bits to the left of the bit
position used for rounding are stored into the FPR and
low-order bit positions, if any, are set to zero.

Four rounding modes are provided which are
user-selectable through FPSCR[RN] as described in Section
4.3.6, “Rounding” on page 90. For rounding, the
conceptual Guard, Round, and Sticky bits are defined in
terms of accumulator bits. Figure 33 shows the
positions of the Guard, Round, and Sticky bits for
double-precision and single-precision floating-point numbers.


Format   Guard   Round   Sticky

Double   G bit   R bit   X bit
Single   24      25      26:52 G,R,X

Figure 33. Location of the Guard, Round and Sticky
Bits

Rounding can be treated as though the significand 
were shifted right, if required, until the least signif¬ 
icant bit to be retained is in the low-order bit position 
of the FRACTION. If any of the Guard, Round, or 
Sticky bits is non-zero, then the result is inexact. 

Z1 and Z2, as defined on page 90, can be used to
approximate the result in the target format when one
of the following rules is used.

■ Round to Nearest

   Guard bit = 0

      The result is truncated. (Result exact (GRX =
      000) or closest to next lower value in
      magnitude (GRX = 001, 010, or 011))

   Guard bit = 1

      Depends on Round and Sticky bits:

      Case a

         If the Round or Sticky bit is one
         (inclusive), the result is incremented. (Result
         closest to next higher value in magnitude
         (GRX = 101, 110, or 111))

      Case b

         If the Round and Sticky bits are zero
         (result midway between closest
         representable values) then if the low-order bit
         of the result is one the result is
         incremented. Otherwise (the low-order bit of
         the result is zero) the result is truncated
         (this is the case of a tie rounded to
         even).

   If during the Round to Nearest process,
   truncation of the unrounded number would
   produce the maximum magnitude for the
   specified precision, then the following action
   is taken:

      Guard bit = 1

         Store infinity with the sign of the
         unrounded result.

      Guard bit = 0

         Store the truncated (maximum
         magnitude) value.

■ Round toward Zero 

Choose the smaller in magnitude of Z1 or Z2. 
See “Rounding” on page 90 for the definitions of 
Z1 and Z2. If Guard, Round, or Sticky bit is non¬ 
zero, the result is inexact. 

■ Round toward +Infinity 

Choose Z1. See “Rounding” on page 90 for the 
definition of Z1. 

■ Round toward −Infinity

Choose Z2. See “Rounding” on page 90 for the 
definition of Z2. 
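
The four rules can be sketched in terms of the G, R, and X bits and the sign of the result. This is an illustrative model, not the architected implementation: the function name and the mode strings are assumptions, the significand is held as an unsigned integer magnitude (L bit followed by FRACTION), and the carry into C and the overflow case are left to the caller.

```python
def round_significand(sig, g, r, x, negative, rn):
    """Round a magnitude `sig` using its Guard, Round, and Sticky bits.

    Returns (rounded significand, inexact flag). The rounded value may be
    one bit wider than `sig` if the increment carries out; the caller is
    expected to shift right and increment the exponent in that case.
    """
    inexact = bool(g or r or x)
    if rn == "nearest":
        # Increment when above the midpoint, or at the midpoint with an
        # odd low-order bit (tie rounded to even).
        increment = bool(g and (r or x or (sig & 1)))
    elif rn == "zero":
        increment = False                     # always truncate the magnitude
    elif rn == "+inf":
        increment = inexact and not negative  # choose Z1, the larger value
    elif rn == "-inf":
        increment = inexact and negative      # choose Z2, the smaller value
    else:
        raise ValueError("unknown rounding mode: %r" % (rn,))
    return sig + int(increment), inexact
```

Because the significand is a magnitude, rounding a negative value toward −Infinity increments it, while rounding a negative value toward +Infinity truncates it.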

Where the result is to have fewer than 53 bits of pre¬ 
cision because the instruction is a Floating Round to 
Single-Precision or single-precision arithmetic instruc¬ 
tion, the intermediate result either is normalized or is 
placed in correct denormalized form before the result 
is potentially rounded. 


4.5.2 Execution Model for 
Multiply-Add Type Instructions 

The PowerPC Architecture makes use of a special 
form of instruction which performs up to three oper¬ 
ations in one instruction (a multiply, an add and a 
negate). With this added capability is the special 
feature of being able to produce a more exact inter¬ 
mediate result as an input to the rounder. 32-bit 
arithmetic is similar except that the FRACTION field is 
smaller. 

The multiply-add operations produce intermediate 
results conforming to the following model: 


| S | C | L | FRACTION | X' |
          0   1    105

Figure 34. Multiply-Add Execution Model

The first part of the operation is a multiply. The
multiply has two 53-bit significands as inputs, which are
assumed to be prenormalized, and produces a result
conforming to the above model. If there is a carry
out of the significand (into the C bit), then the
significand is shifted right one position, shifting the L
bit (leading unit bit) into the most significant bit of the
fraction and shifting the C bit (carry out) into the L bit.
All 106 bits (L bit, the fraction) of the product take
part in the add operation. If the exponents of the two
inputs to the adder are not equal, the significand of
the operand with the smaller exponent is aligned
(shifted) to the right by an amount which is added to
that exponent to make it equal to the other input's
exponent. Zeros are shifted into the left of the
significand as it is aligned and bits shifted out of bit
105 of the significand are ORed into the X' bit. The
add operation also produces a result conforming to
the above model with the X' bit taking part in the add
operation.

The result of the add is then normalized, with all bits 
of the add result, except the X' bit, participating in the 
shift. The normalized result provides an intermediate 
result as input to the rounder which conforms to the 
model described in Section 4.5.1, “Execution Model 
for IEEE Operations” on page 96, where: 

■ The Guard bit is bit 53 of the intermediate result. 

■ The Round bit is bit 54 of the intermediate result. 

■ The Sticky bit is the OR of all remaining bits to 
the right of bit 55, inclusive. 

The rules of rounding the intermediate result are the
same as those described in Section 4.5.1, “Execution
Model for IEEE Operations” on page 96.

If the instruction is Floating Negative Multiply-Add or 
Floating Negative Multiply-Subtract the final result is 
negated. 

Status bits are set to reflect the result of the entire
operation: e.g., no status is recorded for the result of
the multiplication part of the operation.
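
The effect of rounding once, after the add, rather than after each step can be seen with exact rational arithmetic. The sketch below is not a model of the accumulator in Figure 34; it assumes finite double-format operands and Round to Nearest, and the function names are hypothetical.

```python
from fractions import Fraction


def fused_multiply_add(a, c, b):
    """Compute a*c + b exactly, then round once to double format."""
    exact = Fraction(a) * Fraction(c) + Fraction(b)
    return float(exact)


def unfused_multiply_add(a, c, b):
    """Compute a*c, round to double, then add b and round again."""
    return a * c + b
```

With a = 1 + 2^-52, c = 1 − 2^-52, and b = −1, the exact product is 1 − 2^-104. Rounding the product first gives 1.0, so the two-step result is 0.0, while the single rounding preserves −2^-104.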

4.6 Floating-Point Processor Instructions

4.6.1 Floating-Point Storage Access Instructions 


The Storage Access instructions compute the effective
address (EA) of the storage to be accessed as
described in Section 1.11.2, “Effective Address
Calculation” on page 15.

The order of bytes accessed by floating-point loads 
and stores is Big-Endian, unless Little-Endian storage 
ordering is selected as described in Appendix D, 
“Little-Endian Byte Ordering” on page 235. 


- Programming Note - 

The “la” extended mnemonic permits computing
an Effective Address as a Load or Store instruction
would, but loads the address itself into a GPR
rather than loading the value that is in storage at
that address. This extended mnemonic is
described in “Load Address” on page 234.


4.6.1.1 Storage Access Exceptions 

Storage accesses will cause the system error handler 
to be invoked if the program is not allowed to modify 
the target storage (Store only), or if the program 
attempts to access storage that is unavailable. 


4.6.2 Floating-Point Load Instructions 

There are two basic forms of load instruction,
single-precision and double-precision. Because the FPRs
support only floating-point double format,
single-precision Load Floating-Point instructions convert
single-precision data to double format prior to loading
the operands into the target FPR. The conversion and
loading steps are as follows:

Let WORD 0:31 be the floating-point single-precision
operand accessed from storage.

Normalized Operand
if WORD 1:8 > 0 and WORD 1:8 < 255 then
    FRT 0:1 <- WORD 0:1
    FRT 2 <- ¬WORD 1
    FRT 3 <- ¬WORD 1
    FRT 4 <- ¬WORD 1
    FRT 5:63 <- WORD 2:31 || ²⁹0

Denormalized Operand
if WORD 1:8 = 0 and WORD 9:31 ≠ 0 then
    sign <- WORD 0
    exp <- −126
    frac 0:52 <- 0b0 || WORD 9:31 || ²⁹0
    normalize the operand
        Do while frac 0 = 0
            frac <- frac 1:52 || 0b0
            exp <- exp − 1
        End
    FRT 0 <- sign
    FRT 1:11 <- exp + 1023
    FRT 12:63 <- frac 1:52

Zero / Infinity / NaN
if WORD 1:8 = 255 or WORD 1:31 = 0 then
    FRT 0:1 <- WORD 0:1
    FRT 2 <- WORD 1
    FRT 3 <- WORD 1
    FRT 4 <- WORD 1
    FRT 5:63 <- WORD 2:31 || ²⁹0


For double-precision Load Floating-Point instructions,
no conversion is required as the data from storage is
copied directly into the FPR.

Many of the Load Floating-Point instructions have an
“update” form, in which register RA is updated with
the effective address. For these forms, if RA≠0, the
effective address is placed into register RA and the
storage element (word or doubleword) addressed by
EA is loaded into FRT.

Note: Recall that RA, RB, and RT denote General
Purpose Registers, while FRA, FRB, FRC and FRT
denote Floating-Point Registers.

Byte order of PowerPC is Big-Endian by default; see
Appendix D, “Little-Endian Byte Ordering” on
page 235 for PowerPC systems operated with Little-
Endian byte ordering.
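The single-to-double conversion performed by the single-precision loads can be illustrated with a short Python sketch. It is only a model under stated assumptions: the function name load_single_to_double and the helpers single_bits and double_bits are hypothetical, registers are modelled as plain integers, and the normalized case uses the arithmetic rebias (exponent + 896), which yields the same bits as the bit-copy steps given for that case.

```python
import struct


def single_bits(x: float) -> int:
    """32-bit pattern of x rounded to single format (hypothetical helper)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def double_bits(x: float) -> int:
    """64-bit pattern of x in double format (hypothetical helper)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def load_single_to_double(word: int) -> int:
    """Model of DOUBLE(WORD): return the 64-bit image placed into FRT."""
    sign = (word >> 31) & 0x1      # WORD 0
    exp = (word >> 23) & 0xFF      # WORD 1:8
    frac = word & 0x7FFFFF         # WORD 9:31

    if 0 < exp < 255:
        # Normalized operand: copying WORD 1 and its complement into
        # FRT 1:4 is equivalent to adding 896 to the 8-bit exponent.
        return (sign << 63) | ((exp + 896) << 52) | (frac << 29)

    if exp == 0 and frac != 0:
        # Denormalized operand: frac 0:52 = 0b0 || WORD 9:31 || 29 zeros.
        e = -126
        f = frac << 29
        while not (f >> 52) & 1:   # Do while frac 0 = 0
            f <<= 1
            e -= 1
        return (sign << 63) | ((e + 1023) << 52) | (f & ((1 << 52) - 1))

    # Zero / Infinity / NaN: exponent becomes all zeros or all ones.
    exp11 = 0x7FF if exp == 255 else 0
    return (sign << 63) | (exp11 << 52) | (frac << 29)
```

For any value representable in single format, the result should match the double-format bit pattern of the same value, for example load_single_to_double(single_bits(1.0)) == double_bits(1.0).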
Load Floating-Point Single D-form


lfs FRT,D(RA)



if RA = 0 then b <- 0
else b <- (RA)

EA <- b + EXTS(D)

FRT <- DOUBLE(MEM(EA, 4))

Let the effective address (EA) be the sum (RA|0) + D. 

The word in storage addressed by EA is interpreted 
as a floating-point single-precision operand. This 
word is converted to floating-point double format (see 
page 99) and placed into register FRT. 

Special Registers Altered: 

None 
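The D-form effective address calculation, (RA|0) + EXTS(D), can be sketched in Python as follows. This is an illustrative model only: the names exts16 and ea_d_form are hypothetical, the register file is a plain list of integers, the 64-bit wrap-around assumes a 64-bit implementation in 64-bit mode, and raising an exception for RA = 0 in an update form is merely this sketch's way of flagging an invalid instruction form.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit effective addresses


def exts16(d: int) -> int:
    """Sign-extend the 16-bit D field, i.e. EXTS(D)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def ea_d_form(gpr: list, ra: int, d: int, update: bool = False) -> int:
    """Compute EA for a D-form storage access.

    Non-update forms use (RA|0) + EXTS(D): register number 0 means the
    value zero, not the contents of GPR 0. Update forms use (RA) + EXTS(D)
    and write EA back into RA.
    """
    if update:
        if ra == 0:
            raise ValueError("invalid instruction form: RA = 0 with update")
        base = gpr[ra]
    else:
        base = 0 if ra == 0 else gpr[ra]

    ea = (base + exts16(d)) & MASK64
    if update:
        gpr[ra] = ea  # RA <- EA
    return ea
```

With gpr[3] = 0x1000, ea_d_form(gpr, 3, 0xFFF8) gives 0x0FF8 because D = 0xFFF8 is sign-extended to −8, while ea_d_form(gpr, 0, 8) gives 8 regardless of what GPR 0 holds.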


Load Floating-Point Single Indexed 
X-form 

lfsx FRT,RA,RB


31 FRT RA RB 535 /



if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

FRT <- DOUBLE(MEM(EA, 4))

Let the effective address (EA) be the sum 
(RA|0) + (RB). 

The word in storage addressed by EA is interpreted 
as a floating-point single-precision operand. This 
word is converted to floating-point double format (see 
page 99) and placed into register FRT. 

Special Registers Altered: 

None 


Load Floating-Point Single with Update
D-form

lfsu FRT,D(RA)

49 FRT RA D

0 6 11 16 31


EA <- (RA) + EXTS(D)

FRT <- DOUBLE(MEM(EA, 4))

RA <- EA

Let the effective address (EA) be the sum (RA) + D.

The word in storage addressed by EA is interpreted
as a floating-point single-precision operand. This
word is converted to floating-point double format (see
page 99) and placed into register FRT.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None


Load Floating-Point Single with Update
Indexed X-form

lfsux FRT,RA,RB

31 FRT RA RB 567 /

0 6 11 16 21 31

EA <- (RA) + (RB)

FRT <- DOUBLE(MEM(EA, 4))

RA <- EA

Let the effective address (EA) be the sum (RA) + (RB).

The word in storage addressed by EA is interpreted
as a floating-point single-precision operand. This
word is converted to floating-point double format (see
page 99) and placed into register FRT.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None
Load Floating-Point Double D-form


lfd FRT,D(RA)


50 FRT RA D

0 6 11 16 31


if RA = 0 then b <- 0
else b <- (RA)

EA <- b + EXTS(D)

FRT <- MEM(EA, 8)

Let the effective address (EA) be the sum (RA|0) + D. 

The doubleword in storage addressed by EA is placed 
into register FRT. 

Special Registers Altered: 

None 


Load Floating-Point Double Indexed 
X-form 


lfdx FRT,RA,RB


31 FRT RA RB 599 /

0 6 11 16 21 31


if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

FRT <- MEM(EA, 8)

Let the effective address (EA) be the sum 
(RA|0) + (RB). 

The doubleword in storage addressed by EA is placed 
into register FRT. 

Special Registers Altered: 

None 


Load Floating-Point Double with Update
D-form


lfdu FRT,D(RA)


51 FRT RA D

0 6 11 16 31


EA <- (RA) + EXTS(D)

FRT <- MEM(EA, 8)

RA <- EA

Let the effective address (EA) be the sum (RA) + D.

The doubleword in storage addressed by EA is placed
into register FRT.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None


Load Floating-Point Double with Update
Indexed X-form


lfdux FRT,RA,RB


31 FRT RA RB 631 /

0 6 11 16 21 31


EA <- (RA) + (RB)

FRT <- MEM(EA, 8)

RA <- EA

Let the effective address (EA) be the sum (RA) + (RB).

The doubleword in storage addressed by EA is placed
into register FRT.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None
4.6.3 Floating-Point Store Instructions


There are three basic forms of store instruction, 
single-precision, double-precision, and integer. The 
integer form is provided by the optional Store 
Floating-Point as Integer Word instruction, described 
on page 210. Because the FPRs support only floating¬ 
point double format for floating-point data, single¬ 
precision Store Floating-Point instructions convert 
double-precision data to single format prior to storing 
the operands into storage. The conversion steps are 
as follows: 

Let WORD 0:31 be the word in storage written to.

No Denormalization Required (includes Zero / Infinity
/ NaN)
if FRS 1:11 > 896 or FRS 1:63 = 0 then
    WORD 0:1 <- FRS 0:1
    WORD 2:31 <- FRS 5:34

Denormalization Required
if 874 ≤ FRS 1:11 ≤ 896 then
    sign <- FRS 0
    exp <- FRS 1:11 − 1023
    frac <- 0b1 || FRS 12:63
    Denormalize operand
        Do while exp < −126
            frac <- 0b0 || frac 0:62
            exp <- exp + 1
        End
    WORD 0 <- sign
    WORD 1:8 <- 0x00
    WORD 9:31 <- frac 1:23
else WORD <- undefined

Notice that if the value to be stored by a single- 
precision Store Floating-Point instruction is larger in 
magnitude than the maximum number representable 
in single format, the first case above (No Denormal¬ 
ization Required) applies. The result stored in WORD 
is then a well-defined value, but is not numerically 
equal to the value in the source register (i.e., the 
result of a single-precision Load Floating-Point from 
WORD will not compare equal to the contents of the 
original source register). 
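The double-to-single conversion used by the single-precision stores can be sketched in Python in the same spirit. The names store_double_to_single, single_bits and double_bits are hypothetical, registers are modelled as integers, and None stands in for the "undefined" outcome; an implementation is free to store anything in that case.

```python
import struct


def single_bits(x: float) -> int:
    """32-bit pattern of x rounded to single format (hypothetical helper)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def double_bits(x: float) -> int:
    """64-bit pattern of x in double format (hypothetical helper)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def store_double_to_single(frs: int):
    """Model of SINGLE(FRS): return the 32-bit WORD, or None if undefined."""
    sign = (frs >> 63) & 0x1
    exp11 = (frs >> 52) & 0x7FF          # FRS 1:11

    if exp11 > 896 or (frs & 0x7FFF_FFFF_FFFF_FFFF) == 0:
        # No denormalization required: pure bit selection, no rounding.
        top2 = (frs >> 62) & 0x3         # WORD 0:1  <- FRS 0:1
        low30 = (frs >> 29) & 0x3FFF_FFFF  # WORD 2:31 <- FRS 5:34
        return (top2 << 30) | low30

    if 874 <= exp11 <= 896:
        # Denormalization required: make the implicit bit explicit and
        # shift right until the exponent reaches -126.
        exp = exp11 - 1023
        frac = (1 << 52) | (frs & ((1 << 52) - 1))
        while exp < -126:
            frac >>= 1
            exp += 1
        return (sign << 31) | ((frac >> 29) & 0x7FFFFF)  # WORD 1:8 = 0x00

    return None  # WORD is undefined for smaller exponents
```

Because the first case only selects bits, a value too large for single format is stored as a well-defined but numerically different word, which is the behaviour the note above describes. For values exactly representable in single format the result agrees with single_bits, for example store_double_to_single(double_bits(-2.5)) == single_bits(-2.5).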


For double-precision Store Floating-Point instructions 
and for the Store Floating-Point as Integer Word 
instruction, no conversion is required as the data 
from the FPR is copied directly into storage. 

Many of the Store Floating-Point instructions have an
“update” form, in which register RA is updated with
the effective address. For these forms, if RA≠0, the
effective address is placed into register RA.

Note: Recall that RA, RB, and RT denote General 
Purpose Registers, while FRA, FRB, FRC and FRT 
denote Floating-Point Registers. 

Byte order of PowerPC is Big-Endian by default; see 
Appendix D, “Little-Endian Byte Ordering” on 
page 235 for PowerPC systems operated with Little- 
Endian byte ordering. 
Store Floating-Point Single D-form


stfs FRS,D(RA)



if RA = 0 then b <- 0
else b <- (RA)

EA <- b + EXTS(D)

MEM(EA, 4) <- SINGLE(FRS)

Let the effective address (EA) be the sum (RA|0) + D.

The contents of register FRS is converted to single
format (see page 102) and stored into the word in
storage addressed by EA.

Special Registers Altered:

None


Store Floating-Point Single Indexed
X-form


stfsx FRS,RA,RB



if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

MEM(EA, 4) <- SINGLE(FRS)

Let the effective address (EA) be the sum
(RA|0) + (RB).

The contents of register FRS is converted to single
format (see page 102) and stored into the word in
storage addressed by EA.

Special Registers Altered:

None


Store Floating-Point Single with Update
D-form

stfsu FRS,D(RA)

53 FRS RA D

0 6 11 16 31


EA <- (RA) + EXTS(D)

MEM(EA, 4) <- SINGLE(FRS)

RA <- EA

Let the effective address (EA) be the sum (RA) + D.

The contents of register FRS is converted to single
format (see page 102) and stored into the word in
storage addressed by EA.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None


Store Floating-Point Single with Update
Indexed X-form

stfsux FRS,RA,RB

31 FRS RA RB 695 /

0 6 11 16 21 31


EA <- (RA) + (RB)

MEM(EA, 4) <- SINGLE(FRS)

RA <- EA

Let the effective address (EA) be the sum (RA) + (RB).

The contents of register FRS is converted to single
format (see page 102) and stored into the word in
storage addressed by EA.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None
Store Floating-Point Double D-form


stfd FRS,D(RA)



if RA = 0 then b <- 0
else b <- (RA)

EA <- b + EXTS(D)

MEM(EA, 8) <- (FRS)

Let the effective address (EA) be the sum (RA|0) + D.

The contents of register FRS is stored into the
doubleword in storage addressed by EA.

Special Registers Altered:

None


Store Floating-Point Double Indexed
X-form


stfdx FRS,RA,RB



if RA = 0 then b <- 0
else b <- (RA)

EA <- b + (RB)

MEM(EA, 8) <- (FRS)

Let the effective address (EA) be the sum
(RA|0) + (RB).

The contents of register FRS is stored into the
doubleword in storage addressed by EA.

Special Registers Altered:

None


Store Floating-Point Double with Update
D-form


stfdu FRS,D(RA)


55 FRS RA D

0 6 11 16 31


EA <- (RA) + EXTS(D)

MEM(EA, 8) <- (FRS)

RA <- EA

Let the effective address (EA) be the sum (RA) + D.

The contents of register FRS is stored into the
doubleword in storage addressed by EA.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None


Store Floating-Point Double with Update
Indexed X-form


stfdux FRS,RA,RB



EA <- (RA) + (RB)

MEM(EA, 8) <- (FRS)

RA <- EA

Let the effective address (EA) be the sum (RA) + (RB).

The contents of register FRS is stored into the
doubleword in storage addressed by EA.

EA is placed into register RA.

If RA = 0, the instruction form is invalid.

Special Registers Altered:

None
4.6.4 Floating-Point Move Instructions


These instructions copy data from one floating-point
register to another with data modifications as
described for each instruction. These instructions do
not modify the FPSCR.


Floating Move Register X-form


fmr FRT,FRB (Rc = 0)

fmr. FRT,FRB (Rc = 1)


63 FRT /// FRB 72 Rc

0 6 11 16 21 31


The contents of register FRB is placed into register
FRT.

Special Registers Altered:

CR1 (if Rc = 1)


Floating Negate X-form


fneg FRT,FRB (Rc = 0)

fneg. FRT,FRB (Rc = 1)


63 FRT /// FRB 40 Rc

0 6 11 16 21 31


The contents of register FRB with bit 0 inverted is
placed into register FRT.

Special Registers Altered:

CR1 (if Rc = 1)



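Floating Absolute Value X-form


fabs FRT,FRB (Rc = 0)

fabs. FRT,FRB (Rc = 1)


63 FRT /// FRB 264 Rc

0 6 11 16 21 31


The contents of register FRB with bit 0 set to zero is
placed into register FRT.

Special Registers Altered:

CR1 (if Rc = 1)


Floating Negative Absolute Value X-form


fnabs FRT,FRB (Rc = 0)

fnabs. FRT,FRB (Rc = 1)


63 FRT /// FRB 136 Rc

0 6 11 16 21 31


The contents of register FRB with bit 0 set to one is
placed into register FRT.

Special Registers Altered:

CR1 (if Rc = 1)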
4.6.5 Floating-Point Arithmetic Instructions


Floating Add [Single] A-form


fadd FRT,FRA,FRB (Rc = 0)

fadd. FRT,FRA,FRB (Rc = 1)

[Power mnemonics: fa, fa.]


63 FRT FRA FRB /// 21 Rc

0 6 11 16 21 26 31


fadds FRT,FRA,FRB (Rc = 0)

fadds. FRT,FRA,FRB (Rc = 1)


59 FRT FRA FRB /// 21 Rc

0 6 11 16 21 26 31


The floating-point operand in register FRA is added to 
the floating-point operand in register FRB. If the most 
significant bit of the resultant significand is not a one 
the result is normalized. The result is rounded to the 
target precision under control of the Floating-Point 
Rounding Control field RN of the FPSCR and placed 
into register FRT. 

Floating-point addition is based on exponent compar¬ 
ison and addition of the two significands. The expo¬ 
nents of the two operands are compared, and the 
significand accompanying the smaller exponent is 
shifted right, with its exponent increased by one for 
each bit shifted, until the two exponents are equal. 
The two significands are then added algebraically to 
form an intermediate sum. All 53 bits in the 
significand as well as all three guard bits (G, R, and 
X) enter into the computation. 
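The alignment step can be pictured with a small integer model in Python. The function name align_significand is hypothetical, and the sketch assumes that X behaves as a sticky bit, that is, the logical OR of every bit shifted out below the R position; that property comes from the floating-point execution model rather than from the paragraph above. Short significands are used in the examples for readability, but the same code works for 53-bit values.

```python
def align_significand(sig: int, exp: int, target_exp: int):
    """Shift sig right until exp equals target_exp.

    Returns (aligned_significand, G, R, X), where G and R are the first
    two bits shifted out and X is the OR of everything beyond them.
    """
    shift = target_exp - exp
    if shift < 0:
        raise ValueError("target exponent must not be smaller than exp")

    ext = sig << 3                      # make room for G, R and X
    shifted = ext >> shift
    if ext & ((1 << shift) - 1):        # any one bits lost past X?
        shifted |= 1                    # fold them into the sticky bit
    g = (shifted >> 2) & 1
    r = (shifted >> 1) & 1
    x = shifted & 1
    return shifted >> 3, g, r, x
```

For example, aligning the significand 0b1011 by two positions returns (0b10, 1, 1, 0), and aligning it by five positions returns (0, 0, 1, 1): the trailing 0b011 is not discarded but is recorded in X so that rounding can still tell the result is inexact.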

If a carry occurs, the sum's significand is shifted right 
one bit position and the exponent is increased by one. 

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI 

CR1 (if Rc= 1) 


Floating Subtract [Single] A-form


fsub FRT,FRA,FRB (Rc = 0)

fsub. FRT,FRA,FRB (Rc = 1)

[Power mnemonics: fs, fs.]


63 FRT FRA FRB /// 20 Rc

0 6 11 16 21 26 31


fsubs FRT,FRA,FRB (Rc = 0)

fsubs. FRT,FRA,FRB (Rc = 1)


59 FRT FRA FRB /// 20 Rc

0 6 11 16 21 26 31

The floating-point operand in register FRB is sub¬ 
tracted from the floating-point operand in register 
FRA. If the most significant bit of the resultant 
significand is not a one the result is normalized. The 
result is rounded to the target precision under control 
of the Floating-Point Rounding Control field RN of the 
FPSCR and placed into register FRT. 

The execution of the Floating Subtract instruction is 
identical to that of Floating Add, except that the con¬ 
tents of FRB participates in the operation with its sign 
bit (bit 0) inverted. 

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI 

CR1 (if Rc= 1) 
Floating Multiply [Single] A-form


fmul FRT,FRA,FRC (Rc = 0)

fmul. FRT,FRA,FRC (Rc = 1)

[Power mnemonics: fm, fm.]


63 FRT FRA /// FRC 25 Rc

0 6 11 16 21 26 31


fmuls FRT,FRA,FRC (Rc = 0)

fmuls. FRT,FRA,FRC (Rc = 1)


59 FRT FRA /// FRC 25 Rc

0 6 11 16 21 26 31


The floating-point operand in register FRA is multi¬ 
plied by the floating-point operand in register FRC. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the FPSCR 
and placed into register FRT. 

Floating-point multiplication is based on exponent 
addition and multiplication of the significands. 

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXIMZ 

CR1 (if Rc = 1) 


Floating Divide [Single] A-form


fdiv FRT,FRA,FRB (Rc = 0)

fdiv. FRT,FRA,FRB (Rc = 1)

[Power mnemonics: fd, fd.]


63 FRT FRA FRB /// 18 Rc

0 6 11 16 21 26 31


fdivs FRT,FRA,FRB (Rc = 0)

fdivs. FRT,FRA,FRB (Rc = 1)


59 FRT FRA FRB /// 18 Rc

0 6 11 16 21 26 31


The floating-point operand in register FRA is divided 
by the floating-point operand in register FRB. The 
remainder is not supplied as a result. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the FPSCR 
and placed into register FRT. 

Floating-point division is based on exponent sub¬ 
traction and division of the significands. 

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1 and Zero Divide Exceptions when
FPSCR ZE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX ZX XX 
VXSNAN VXIDI VXZDZ 

CR1 (if Rc= 1) 
4.6.6 Floating-Point Multiply-Add Instructions


These instructions combine a multiply and add operation
without an intermediate rounding operation. The
fraction part of the intermediate product is 106 bits
wide, and all 106 bits take part in the add/subtract
portion of the instruction.
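The practical effect of omitting the intermediate rounding can be demonstrated with exact rational arithmetic in Python. This is a sketch under assumptions: the function names are hypothetical, only Round to Nearest is modelled, and the sign of a zero result and special values such as infinities and NaNs are not handled, since Fraction cannot represent them.

```python
from fractions import Fraction


def fused_multiply_add(a: float, c: float, b: float) -> float:
    """(a × c) + b computed exactly, then rounded once to double."""
    return float(Fraction(a) * Fraction(c) + Fraction(b))


def separate_multiply_add(a: float, c: float, b: float) -> float:
    """(a × c) rounded to double, then + b rounded again."""
    return a * c + b


a = 1.0 + 2.0 ** -27
c = 1.0 - 2.0 ** -27
b = -1.0
# The exact product is 1 - 2**-54, which is not representable in double
# format and rounds to 1.0 when the multiply is performed separately.
fused = fused_multiply_add(a, c, b)
unfused = separate_multiply_add(a, c, b)
```

Here fused is −2⁻⁵⁴ while unfused is 0.0: the low-order product bits that a separately rounded multiply would discard still take part in the addition.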


Floating Multiply-Add [Single] A-form


fmadd FRT,FRA,FRC,FRB (Rc = 0)

fmadd. FRT,FRA,FRC,FRB (Rc = 1)

[Power mnemonics: fma, fma.]


63 FRT FRA FRB FRC 29 Rc

0 6 11 16 21 26 31


fmadds FRT,FRA,FRC,FRB (Rc = 0)

fmadds. FRT,FRA,FRC,FRB (Rc = 1)


59 FRT FRA FRB FRC 29 Rc

0 6 11 16 21 26 31


The operation

FRT <- [(FRA)×(FRC)] + (FRB)

is performed.

The floating-point operand in register FRA is multi¬ 
plied by the floating-point operand in register FRC. 
The floating-point operand in register FRB is added to 
this intermediate result. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the FPSCR 
and placed into register FRT. 

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI VXIMZ 

CR1 (if Rc = 1) 


Floating Multiply-Subtract [Single]
A-form


fmsub FRT,FRA,FRC,FRB (Rc = 0)

fmsub. FRT,FRA,FRC,FRB (Rc = 1)

[Power mnemonics: fms, fms.]


63 FRT FRA FRB FRC 28 Rc

0 6 11 16 21 26 31


fmsubs FRT,FRA,FRC,FRB (Rc = 0)

fmsubs. FRT,FRA,FRC,FRB (Rc = 1)


59 FRT FRA FRB FRC 28 Rc

0 6 11 16 21 26 31


The operation

FRT <- [(FRA)×(FRC)] − (FRB)

is performed.

The floating-point operand in register FRA is multi¬ 
plied by the floating-point operand in register FRC. 
The floating-point operand in register FRB is sub¬ 
tracted from this intermediate result. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the FPSCR 
and placed into register FRT. 

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.


Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI VXIMZ 

CR1 (if Rc = 1)
Floating Negative Multiply-Add [Single]
A-form


fnmadd FRT,FRA,FRC,FRB (Rc = 0)

fnmadd. FRT,FRA,FRC,FRB (Rc = 1)

[Power mnemonics: fnma, fnma.]


63 FRT FRA FRB FRC 31 Rc

0 6 11 16 21 26 31


fnmadds FRT,FRA,FRC,FRB (Rc = 0)

fnmadds. FRT,FRA,FRC,FRB (Rc = 1)


59 FRT FRA FRB FRC 31 Rc

0 6 11 16 21 26 31


The operation

FRT <- −( [(FRA)×(FRC)] + (FRB) )

is performed.

The floating-point operand in register FRA is multi¬ 
plied by the floating-point operand in register FRC. 
The floating-point operand in register FRB is added to 
this intermediate result. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the 
FPSCR, then negated and placed into register FRT. 

This instruction produces the same result as would be 
obtained by using the Floating Multiply-Add instruc¬ 
tion and then negating the result, with the following 
exceptions: 

■ QNaNs propagate with no effect on their “sign”
bit.

■ QNaNs that are generated as the result of a disabled
Invalid Operation Exception have a “sign” bit
of zero.

■ SNaNs that are converted to QNaNs as the result
of a disabled Invalid Operation Exception retain
the “sign” bit of the SNaN.

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI VXIMZ 

CR1 (if Rc = 1) 


Floating Negative Multiply-Subtract
[Single] A-form


fnmsub FRT,FRA,FRC,FRB (Rc = 0)

fnmsub. FRT,FRA,FRC,FRB (Rc = 1)

[Power mnemonics: fnms, fnms.]


63 FRT FRA FRB FRC 30 Rc

0 6 11 16 21 26 31


fnmsubs FRT,FRA,FRC,FRB (Rc = 0)

fnmsubs. FRT,FRA,FRC,FRB (Rc = 1)


59 FRT FRA FRB FRC 30 Rc

0 6 11 16 21 26 31


The operation

FRT <- −( [(FRA)×(FRC)] − (FRB) )

is performed.

The floating-point operand in register FRA is multi¬ 
plied by the floating-point operand in register FRC. 
The floating-point operand in register FRB is sub¬ 
tracted from this intermediate result. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the 
FPSCR, then negated and placed into register FRT. 

This instruction produces the same result as would be 
obtained by using the Floating Multiply-Subtract 
instruction and then negating the result, with the fol¬ 
lowing exceptions: 

■ QNaNs propagate with no effect on their “sign”
bit.

■ QNaNs that are generated as the result of a disabled
Invalid Operation Exception have a “sign” bit
of zero.

■ SNaNs that are converted to QNaNs as the result
of a disabled Invalid Operation Exception retain
the “sign” bit of the SNaN.

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI VXIMZ 

CR1 (if Rc = 1)
4.6.7 Floating-Point Rounding and Conversion Instructions


- Programming Note -

Examples of uses of these instructions to perform
various conversions can be found in Appendix E.3,
“Floating-Point Conversions” on page 250.


Floating Round to Single-Precision
X-form


frsp FRT,FRB (Rc = 0)

frsp. FRT,FRB (Rc = 1)


63 FRT /// FRB 12 Rc

0 6 11 16 21 31


If it is already in single-precision range, the floating-point
operand in register FRB is placed into register
FRT. Otherwise the floating-point operand in register
FRB is rounded to single-precision using the rounding
mode specified by FPSCR RN and placed into register
FRT.

The rounding is described fully in Appendix B.1,
“Floating-Point Round to Single-Precision Model” on
page 213.

FPSCR FPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCR VE = 1.

Special Registers Altered: 

FPRF FR FI 
FX OX UX XX 
VXSNAN 

CR1 (if Rc = 1) 


Floating Convert To Integer Doubleword
X-form


fctid FRT,FRB (Rc = 0)

fctid. FRT,FRB (Rc = 1)


63 FRT /// FRB 814 Rc

0 6 11 16 21 31


The floating-point operand in register FRB is converted
to a 64-bit signed fixed-point integer, using the
rounding mode specified by FPSCR RN, and placed into
register FRT.

If the operand in FRB is greater than 2⁶³ − 1, then
FRT is set to 0x7FFF_FFFF_FFFF_FFFF. If the
operand in FRB is less than −2⁶³, then FRT is set to
0x8000_0000_0000_0000.

The conversion is described fully in Appendix B.2,
“Floating-Point Convert to Integer Model” on
page 218.

Except for enabled Invalid Operation Exceptions,
FPSCR FPRF is undefined. FPSCR FR is set if the result
is incremented when rounded. FPSCR FI is set if the
result is inexact.

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 


Special Registers Altered: 

FPRF (undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc = 1) 
Floating Convert To Integer Doubleword
with round toward Zero X-form


fctidz FRT,FRB (Rc = 0)

fctidz. FRT,FRB (Rc = 1)


63 FRT /// FRB 815 Rc

0 6 11 16 21 31


The floating-point operand in register FRB is converted
to a 64-bit signed fixed-point integer, using the
rounding mode Round toward Zero, and placed into
register FRT.

If the operand in FRB is greater than 2⁶³ − 1, then
FRT is set to 0x7FFF_FFFF_FFFF_FFFF. If the
operand in FRB is less than −2⁶³, then FRT is set to
0x8000_0000_0000_0000.

The conversion is described fully in Appendix B.2,
“Floating-Point Convert to Integer Model” on
page 218.

Except for enabled Invalid Operation Exceptions,
FPSCR FPRF is undefined. FPSCR FR is set if the result
is incremented when rounded. FPSCR FI is set if the
result is inexact.

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

FPRF (undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc = 1) 


Floating Convert To Integer Word
X-form


fctiw FRT,FRB (Rc = 0)

fctiw. FRT,FRB (Rc = 1)


63 FRT /// FRB 14 Rc

0 6 11 16 21 31


The floating-point operand in register FRB is converted
to a 32-bit signed fixed-point integer, using the
rounding mode specified by FPSCR RN, and placed in
bits 32:63 of register FRT. Bits 0:31 of register FRT
are undefined.

If the operand in FRB is greater than 2³¹ − 1, then bits
32:63 of FRT are set to 0x7FFF_FFFF. If the operand
in FRB is less than −2³¹, then bits 32:63 of FRT are
set to 0x8000_0000.

The conversion is described fully in Appendix B.2,
“Floating-Point Convert to Integer Model” on
page 218.

Except for enabled Invalid Operation Exceptions,
FPSCR FPRF is undefined. FPSCR FR is set if the result
is incremented when rounded. FPSCR FI is set if the
result is inexact.

Special Registers Altered: 

FPRF (undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc = 1) 
Floating Convert To Integer Word with
round toward Zero X-form


fctiwz FRT,FRB (Rc = 0)

fctiwz. FRT,FRB (Rc = 1)


63 FRT /// FRB 15 Rc

0 6 11 16 21 31


The floating-point operand in register FRB is converted
to a 32-bit signed fixed-point integer, using the
rounding mode Round toward Zero, and placed in bits
32:63 of register FRT. Bits 0:31 of register FRT are
undefined.

If the operand in FRB is greater than 2³¹ − 1, then bits
32:63 of FRT are set to 0x7FFF_FFFF. If the operand
in FRB is less than −2³¹, then bits 32:63 of FRT are
set to 0x8000_0000.

The conversion is described fully in Appendix B.2,
“Floating-Point Convert to Integer Model” on
page 218.

Except for enabled Invalid Operation Exceptions,
FPSCR FPRF is undefined. FPSCR FR is set if the result
is incremented when rounded. FPSCR FI is set if the
result is inexact.

Special Registers Altered: 

FPRF (undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc = 1) 
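The saturating behaviour of Floating Convert To Integer Word with round toward Zero can be sketched in Python as follows. The function name fctiwz_model is hypothetical, the result is returned as the unsigned 32-bit pattern that would occupy bits 32:63 of FRT, and FPSCR updates are left out. NaN operands are deliberately not modelled here because their handling is specified in Appendix B.2 rather than in the description above.

```python
import math


def fctiwz_model(x: float) -> int:
    """Return bits 32:63 of FRT for operand x, as an unsigned integer."""
    if math.isnan(x):
        raise ValueError("NaN operands are not modelled in this sketch")
    if x > 2**31 - 1:
        return 0x7FFF_FFFF            # saturate at the largest value
    if x < -(2**31):
        return 0x8000_0000            # saturate at the most negative value
    # Round toward Zero is plain truncation; the mask yields the
    # two's-complement pattern for negative results.
    return math.trunc(x) & 0xFFFF_FFFF
```

For instance, fctiwz_model(-2.9) yields 0xFFFF_FFFE (the pattern for −2), and fctiwz_model(3e9) yields 0x7FFF_FFFF because 3 × 10⁹ exceeds 2³¹ − 1.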


Floating Convert From Integer
Doubleword X-form


fcfid FRT,FRB (Rc = 0)

fcfid. FRT,FRB (Rc = 1)


63 FRT /// FRB 846 Rc

0 6 11 16 21 31


The 64-bit signed fixed-point operand in register FRB
is converted to an infinitely precise floating-point
integer. If the result of the conversion is already in
double-precision range it is placed into register FRT.
Otherwise the result of the conversion is rounded to
double-precision using the rounding mode specified
by FPSCR RN and placed into register FRT.

The conversion is described fully in Appendix B.3,
“Floating-Point Convert from Integer Model” on
page 221.

FPSCR FPRF is set to the class and sign of the result.
FPSCR FR is set if the result is incremented when
rounded. FPSCR FI is set if the result is inexact.

This instruction is defined only for 64-bit implementa¬ 
tions. Using it on a 32-bit implementation will cause 
the system illegal instruction error handler to be 
invoked. 

Special Registers Altered: 

FPRF FR FI 
FX XX 

CR1 (if Rc = 1) 
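A minimal Python sketch of Floating Convert From Integer Doubleword, assuming the Round to Nearest mode only, is shown below. The name fcfid_model is hypothetical; the function takes the 64-bit pattern held in FRB and returns the converted value together with a flag corresponding to FI (result inexact). Other rounding modes, FR and FPRF are not modelled.

```python
MASK64 = (1 << 64) - 1


def fcfid_model(frb_bits: int):
    """Convert a 64-bit two's-complement pattern to double.

    Returns (value, inexact). Python's int-to-float conversion rounds to
    nearest with ties to even, which stands in for Round to Nearest.
    """
    n = frb_bits & MASK64
    if n >= 1 << 63:                  # reinterpret as a signed integer
        n -= 1 << 64
    value = float(n)
    inexact = int(value) != n         # rounding changed the integer
    return value, inexact
```

Integers of magnitude up to 2⁵³ convert exactly; beyond that, a value such as 2⁵³ + 1 falls between two doubles and the flag reports the loss: fcfid_model(2**53 + 1) returns (9007199254740992.0, True).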
4.6.8 Floating-Point Compare Instructions


The floating-point Compare instructions compare the
contents of two floating-point registers. Comparison
ignores the sign of zero (i.e., regards +0 as equal to
−0). The comparison can be ordered or unordered.

The comparison sets one bit in the designated CR
field to one, and the other three to zero. The FPCC is
set in the same way.


The CR field and the FPCC are interpreted as follows:


Bit  Name  Description
0    FL    (FRA) < (FRB)
1    FG    (FRA) > (FRB)
2    FE    (FRA) = (FRB)
3    FU    (FRA) ? (FRB) (unordered)


Floating Compare Unordered X-form

fcmpu BF,FRA,FRB


63 BF // FRA FRB 0 /

0 6 9 11 16 21 31


if (FRA) is a NaN or
   (FRB) is a NaN then c <- 0b0001
else if (FRA) < (FRB) then c <- 0b1000
else if (FRA) > (FRB) then c <- 0b0100
else c <- 0b0010

FPCC <- c

CR 4×BF:4×BF+3 <- c

if (FRA) is an SNaN or
   (FRB) is an SNaN then
   VXSNAN <- 1

The floating-point operand in register FRA is com¬ 
pared to the floating-point operand in register FRB. 
The result of the compare is placed into CR field BF 
and the FPCC. 

If either of the operands is a NaN, either quiet or sig¬ 
nalling, then CR field BF and the FPCC are set to 
reflect unordered. If either of the operands is a Sig¬ 
nalling NaN, then VXSNAN is set. 

Special Registers Altered: 

CR field BF 

FPCC 

FX 

VXSNAN 
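The selection of the 4-bit result c and its placement in a CR field can be sketched in Python. The names fcmpu_result and set_cr_field are hypothetical, the operands are ordinary Python floats, and the CR is modelled as a 32-bit integer in which bit 0 is the most significant bit, matching the bit numbering used in this book. Setting VXSNAN is omitted because Python floats do not distinguish signalling from quiet NaNs.

```python
import math


def fcmpu_result(fra: float, frb: float) -> int:
    """Return c as the 4-bit value FL FG FE FU (FL is the high-order bit)."""
    if math.isnan(fra) or math.isnan(frb):
        return 0b0001                 # FU: unordered
    if fra < frb:
        return 0b1000                 # FL
    if fra > frb:
        return 0b0100                 # FG
    return 0b0010                     # FE (+0 and -0 compare equal)


def set_cr_field(cr: int, bf: int, c: int) -> int:
    """Place c into CR bits 4×BF:4×BF+3 of a 32-bit CR image."""
    shift = 28 - 4 * bf               # field 0 is the leftmost nibble
    return (cr & ~(0xF << shift) & 0xFFFF_FFFF) | (c << shift)
```

Comparing 0.0 with −0.0 gives 0b0010 (equal), and writing the result 0b1000 into CR field 1 of an all-zero CR gives 0x0800_0000.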


Floating Compare Ordered X-form


fcmpo BF,FRA,FRB


63 BF // FRA FRB 32 /

0 6 9 11 16 21 31


if (FRA) is a NaN or
   (FRB) is a NaN then c <- 0b0001
else if (FRA) < (FRB) then c <- 0b1000
else if (FRA) > (FRB) then c <- 0b0100
else c <- 0b0010

FPCC <- c

CR 4×BF:4×BF+3 <- c

if (FRA) is an SNaN or
   (FRB) is an SNaN then
   VXSNAN <- 1
   if VE = 0 then VXVC <- 1
else if (FRA) is a QNaN or
   (FRB) is a QNaN then VXVC <- 1

The floating-point operand in register FRA is com¬ 
pared to the floating-point operand in register FRB. 
The result of the compare is placed into CR field BF 
and the FPCC. 

If either of the operands is a NaN, either quiet or sig¬ 
nalling, then CR field BF and the FPCC are set to 
reflect unordered. If either of the operands is a Sig¬ 
nalling NaN, then VXSNAN is set and, if Invalid Opera¬ 
tion is disabled (VE = 0), VXVC is set. If neither 
operand is a Signalling NaN but at least one operand 
is a Quiet NaN, then VXVC is set. 

Special Registers Altered: 

CR field BF 

FPCC 

FX 

VXSNAN VXVC 
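The rules that decide which invalid-operation bits Floating Compare Ordered sets can be sketched in Python on raw 64-bit operand images. The names classify_nan and fcmpo_flags are hypothetical. The sketch assumes the usual double-format encoding, in which a NaN has an all-ones exponent and a nonzero fraction and the high-order fraction bit distinguishes a Quiet NaN (1) from a Signalling NaN (0); that encoding is defined elsewhere in the architecture, not in this section.

```python
def classify_nan(bits: int):
    """Return "SNaN", "QNaN", or None for a 64-bit double image."""
    exp = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    if exp != 0x7FF or frac == 0:
        return None                   # finite number or infinity
    return "QNaN" if (frac >> 51) & 1 else "SNaN"


def fcmpo_flags(fra_bits: int, frb_bits: int, ve: int):
    """Return (VXSNAN, VXVC) as booleans meaning 'bit is set to 1'."""
    kinds = (classify_nan(fra_bits), classify_nan(frb_bits))
    vxsnan = vxvc = False
    if "SNaN" in kinds:
        vxsnan = True
        if ve == 0:                   # Invalid Operation disabled
            vxvc = True
    elif "QNaN" in kinds:
        vxvc = True
    return vxsnan, vxvc
```

With a Signalling NaN operand, VXVC is set only when VE = 0; with a Quiet NaN and no Signalling NaN, VXVC is set regardless of VE.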
4.6.9 Floating-Point Status and Control Register Instructions


Every Floating-Point Status and Control Register 
instruction appears to synchronize the effects of all 
floating-point instructions executed by a given 
processor. Executing a Floating-Point Status and 
Control Register instruction ensures that all floating¬ 
point instructions previously initiated by the given 
processor appear to have completed before the 
Floating-Point Status and Control Register instruction 
is initiated, and that no subsequent floating-point 
instructions appear to be initiated by the given 
processor until the Floating-Point Status and Control 
Register instruction has completed. In particular: 

■ all exceptions that will be caused by the previously
initiated instructions are recorded in the
FPSCR before the Floating-Point Status and
Control Register instruction is initiated;

■ all invocations of the system floating-point 
enabled exception error handler that will be 
caused by the previously initiated instructions 
have occurred before the Floating-Point Status 
and Control Register instruction is initiated; and 

■ no subsequent floating-point instruction that 
depends on or alters the settings of any FPSCR 
bits appears to be initiated until the Floating- 
Point Status and Control Register instruction has 
completed. 

(Floating-point Storage Access instructions are not 
affected.) 


Move From FPSCR X-form

mffs FRT (Rc = 0)
mffs. FRT (Rc = 1)

Field:      63    FRT   ///   ///   583   Rc
Start bit:  0     6     11    16    21    31

The contents of the FPSCR are placed into bits 32:63 of
register FRT. Bits 0:31 of register FRT are undefined.

Special Registers Altered:

CR1 (if Rc = 1)


Move to Condition Register from FPSCR X-form

mcrfs BF,BFA

Field:      63    BF    //    BFA   //    ///   64    /
Start bit:  0     6     9     11    14    16    21    31

The contents of FPSCR field BFA are copied to CR
field BF. All exception bits copied are reset to zero in
the FPSCR.

Special Registers Altered:

CR field BF
FX OX (if BFA = 0)
UX ZX XX VXSNAN (if BFA = 1)
VXISI VXIDI VXZDZ VXIMZ (if BFA = 2)
VXVC (if BFA = 3)
VXSOFT VXSQRT VXCVI (if BFA = 5)
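
The effect of mcrfs can be modelled in a few lines of Python. The sketch below treats the FPSCR and the CR as 32-bit integers in which bit 0 is the most significant bit. The bit positions of the individual exception bits are taken from the FPSCR layout defined elsewhere in the architecture and should be checked against it; the function name and the dictionary are hypothetical. The sketch does not recompute the summary bits FEX and VX.

```python
# Exception bits cleared by mcrfs, keyed by FPSCR field number.
# Bit 0 is the most significant bit of the 32-bit FPSCR.
CLEARED_BY_FIELD = {
    0: (0, 3),           # FX, OX
    1: (4, 5, 6, 7),     # UX, ZX, XX, VXSNAN
    2: (8, 9, 10, 11),   # VXISI, VXIDI, VXZDZ, VXIMZ
    3: (12,),            # VXVC
    5: (21, 22, 23),     # VXSOFT, VXSQRT, VXCVI
}


def mcrfs(fpscr, cr, bf, bfa):
    """Copy FPSCR field bfa to CR field bf; return the new (fpscr, cr)."""
    field = (fpscr >> (28 - 4 * bfa)) & 0xF
    cr_shift = 28 - 4 * bf
    cr = (cr & ~(0xF << cr_shift) & 0xFFFFFFFF) | (field << cr_shift)
    # Reset only the exception bits that were copied.
    for bit in CLEARED_BY_FIELD.get(bfa, ()):
        fpscr &= ~(1 << (31 - bit)) & 0xFFFFFFFF
    return fpscr, cr
```

Copying field 3, for instance, transfers all four bits of the field to the CR but resets only VXVC, because the other three bits of that field are not exception bits.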
Move To FPSCR Field Immediate X-form

mtfsfi BF,U (Rc = 0)
mtfsfi. BF,U (Rc = 1)

Field:      63    BF    //    ///   U     /     134   Rc
Start bit:  0     6     9     11    16    20    21    31

The value of the U field is placed into FPSCR field BF.

Special Registers Altered:

FPSCR field BF
CR1 (if Rc = 1)

- Programming Note -

When FPSCR 0:3 is specified, bits 0 (FX) and 3 (OX)
are set to the values of U 0 and U 3 (i.e., even if
this instruction causes OX to change from 0 to 1,
FX is set from U 0 and not by the usual rule that
FX is set to 1 when an exception bit changes from
0 to 1). Bits 1 and 2 (FEX and VX) are set
according to the usual rule, given on page 84, and
not from U 1:2.


Move To FPSCR Fields XFL-form

mtfsf FLM,FRB (Rc = 0)
mtfsf. FLM,FRB (Rc = 1)

Field:      63    /     FLM   /     FRB   711   Rc
Start bit:  0     6     7     15    16    21    31

The contents of bits 32:63 of register FRB are placed
into the FPSCR under control of the field mask specified
by FLM. The field mask identifies the 4-bit fields
affected. Let i be an integer in the range 0-7. If
FLM i = 1 then FPSCR field i (FPSCR bits 4×i through
4×i + 3) is set to the contents of the corresponding
field of the low-order 32 bits of register FRB.

Special Registers Altered:

FPSCR fields selected by mask
CR1 (if Rc = 1)

- Programming Note -

Updating fewer than all eight fields of the FPSCR
may have substantially poorer performance on
some implementations than updating all the fields.
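
The field-mask behaviour of mtfsf, and mtfsfi as the single-field special case, can be sketched as follows. This is an illustrative model only. The FPSCR is a 32-bit integer with bit 0 as the most significant bit, and FLM is an 8-bit mask whose most significant bit selects field 0. The helper names, the list of invalid-operation bits, and the exception/enable pairs used to recompute VX and FEX are assumptions drawn from the FPSCR layout defined elsewhere in the architecture and should be verified there.

```python
VX_BITS = (7, 8, 9, 10, 11, 12, 21, 22, 23)   # assumed VX* exception bits
# (exception bit, enable bit): VX/VE, OX/OE, UX/UE, ZX/ZE, XX/XE (assumed)
ENABLE_PAIRS = ((2, 24), (3, 25), (4, 26), (5, 27), (6, 28))


def get_bit(value, bit):
    return (value >> (31 - bit)) & 1


def put_bit(value, bit, on):
    mask = 1 << (31 - bit)
    return (value | mask) if on else (value & ~mask & 0xFFFFFFFF)


def apply_summary_rule(fpscr):
    """Recompute VX (bit 2) and FEX (bit 1) from the other FPSCR bits."""
    fpscr = put_bit(fpscr, 2, any(get_bit(fpscr, b) for b in VX_BITS))
    fex = any(get_bit(fpscr, e) and get_bit(fpscr, en) for e, en in ENABLE_PAIRS)
    return put_bit(fpscr, 1, fex)


def mtfsf(fpscr, frb_low32, flm):
    for i in range(8):
        if (flm >> (7 - i)) & 1:
            mask = 0xF << (28 - 4 * i)
            if i == 0:
                mask &= 0x9 << 28   # only FX and OX are copied in field 0
            fpscr = (fpscr & ~mask & 0xFFFFFFFF) | (frb_low32 & mask)
    return apply_summary_rule(fpscr)


def mtfsfi(fpscr, bf, u):
    return mtfsf(fpscr, (u & 0xF) << (28 - 4 * bf), 0x80 >> bf)
```

Writing U = 0b1111 to field 0 in this model sets only FX and OX, which matches the rule that FEX and VX are derived rather than copied.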
Move To FPSCR Bit 0 X-form

mtfsb0 BT (Rc = 0)
mtfsb0. BT (Rc = 1)

Field:      63    BT    ///   ///   70    Rc
Start bit:  0     6     11    16    21    31

Bit BT of the FPSCR is set to zero.

Special Registers Altered:

FPSCR bit BT
CR1 (if Rc = 1)


Move To FPSCR Bit 1 X-form

mtfsb1 BT (Rc = 0)
mtfsb1. BT (Rc = 1)

Field:      63    BT    ///   ///   38    Rc
Start bit:  0     6     11    16    21    31

Bit BT of the FPSCR is set to one.

Special Registers Altered:

FPSCR bit BT
CR1 (if Rc = 1)




















Part 2. PowerPC Virtual Environment Architecture 


This part defines the additional instructions and facilities,
beyond those of the PowerPC User Instruction
Set Architecture. It covers the storage model and
related instructions and facilities available to the
application programmer, and the Time Base as seen
by the application programmer.


Chapter 5. Storage Model
5.1 Definitions and Notation
5.2 Introduction
5.3 Single-copy Atomicity
5.4 Memory Coherence
5.5 Storage Control Attributes
5.6 Cache Models
5.7 Shared Storage
5.8 Virtual Storage

Chapter 6. Effect of Operand Placement on Performance
6.1 Instruction Restart
6.2 Atomicity and Order

Chapter 7. Storage Control Instructions
7.1 Parameters Useful to Application Programs
7.2 Cache Management Instructions
7.3 Enforce In-order Execution of I/O Instruction

Chapter 8. Time Base
8.1 Time Base Instructions
8.2 Reading the Time Base on 64-bit Implementations
8.3 Reading the Time Base on 32-bit Implementations
8.4 Computing Time of Day from the Time Base


























Chapter 5. Storage Model 


5.1 Definitions and Notation 

The following definitions, in addition to those specified 
in Book I, are used in this document. 

■ main storage 

The common storage that a processor or other 
mechanism accesses when it has no cache or has 
no copy of the storage being accessed in its 
cache. 

■ sequential execution 

A model for the execution of a sequence of 
instructions (program) in which one instruction is 
executed and completed before the next instruction
is begun. Instructions are executed in the
order in which they appear in the program, 
except following the execution of a branch 
instruction, which causes sequential execution to 
continue at the location specified by the branch 
instruction. 

■ program order 

The execution of instructions in the strict order in 
which they occur in the program. See sequential 
execution above. 

■ processor 

A hardware component that executes the 
PowerPC instructions specified in a program. 

■ storage location 

One or more sequential bytes of storage beginning
at the address computed by a Storage
Access instruction. The number of bytes comprising
the location depends on the type of
Storage Access instruction being executed. 

■ load 

An instruction that copies one or more bytes from 
a storage location to one or more registers (GPRs 
or FPRs). 

■ store 

An instruction that copies one or more bytes from 
one or more registers (GPRs or FPRs) to a 
storage location. 

■ system 

A combination of processors, storage, and associated
mechanisms that is capable of executing
programs. Sometimes the reference to system
includes services provided by the operating
system.

■ uniprocessor 

A system that contains one PowerPC processor. 

■ multiprocessor 

A system that contains two or more PowerPC 
processors. 

■ shared storage multiprocessor 

A multiprocessor that contains some common 
storage, which all the PowerPC processors in the 
system can access. 

■ performed 

A load is performed with respect to all other 
processors (and mechanisms) when the value to 
be returned by the load can no longer be 
changed by a subsequent store by any processor 
(or other mechanism). 

A store is performed with respect to all other 
processors (and mechanisms) when any load 
from the same location used by the store returns 
the value stored (or a value stored subsequently). 

■ storage page 

The aligned unit of storage that is managed by 
the virtual storage system and that can be 
assigned storage control attributes. 

■ block 

The aligned unit of storage operated on by each 
Cache Management instruction. The size of a 
block can vary by instruction and by implementation.
The maximum block size is one page.

■ aligned storage access 

A load or store is aligned if the address of the 
target storage location is a multiple of the size of 
the transfer effected by the instruction. 

■ atomic access 

A storage access executed by a processor during 
which no other processor or mechanism can 
access any byte of the target location between 
the time the processor performing the access 
accesses any byte of the location and the time 
that it completes the access to all bytes of that 
location. 
5.2 Introduction

The PowerPC User Instruction Set Architecture
defines storage as a linear array of bytes indexed
from 0 to a maximum of 2^64 - 1 {2^32 - 1}. Each byte is
identified by its index, called its address. Each byte
contains a value. This information is sufficient to
allow the programming of applications which require
no special features of any particular system environment.
The PowerPC Virtual Environment Architecture,
described herein, expands this simple storage model
to include caches, virtual storage, and shared
storage multiprocessors. The PowerPC Virtual Environment
Architecture, in conjunction with services
based on the PowerPC Operating Environment Architecture
and provided by the operating system, permits
explicit control of this expanded storage model. A
simple model for sequential execution allows at most
one storage access to be performed at a time, and
requires that all storage accesses appear to be performed
in program order. In contrast to this simple
model, the PowerPC architecture specifies a relaxed
model of memory consistency. In a multiprocessor
system that allows multiple copies of a location,
aggressive implementations of the architecture can
permit intervals of time during which different copies
of a location have different values. This chapter
describes features of the PowerPC architecture that
enable programmers to write correct programs for
this memory model.


5.3 Single-copy Atomicity 

An access is single-copy atomic, or simply atomic, if it 
is always performed in its entirety with no visible 
fragmentation. Atomic accesses are thus serialized: 
each happens in its entirety in some order, even 
when that order is not specified in the program nor 
enforced between processors. 

In PowerPC the following single-register accesses are 
always atomic: 

■ byte accesses (all bytes are aligned on byte 
boundaries) 

■ halfword accesses aligned on halfword boundaries

■ word accesses aligned on word boundaries 

■ doubleword accesses aligned on doubleword 
boundaries (64-bit implementations only) 

No other accesses are guaranteed to be atomic. In 
particular, multiple-register loads and stores are not 
atomic, nor are floating-point doubleword accesses on 
a 32-bit implementation. 
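
The atomicity rules above reduce to a small predicate. The following Python sketch is an illustration under stated assumptions: the function and parameter names are hypothetical, sizes are in bytes, and floating-point doubleword accesses on a 32-bit implementation are covered by the rule that doubleword atomicity applies to 64-bit implementations only.

```python
def is_aligned(address, size):
    """An access is aligned if its address is a multiple of its size."""
    return address % size == 0


def guaranteed_atomic(address, size, impl_bits=64, single_register=True):
    """Return True if the architecture guarantees the access is atomic."""
    if not single_register:
        return False                  # multiple-register loads and stores
    if size not in (1, 2, 4, 8):
        return False                  # not a byte/halfword/word/doubleword
    if size == 8 and impl_bits != 64:
        return False                  # doublewords: 64-bit implementations only
    return is_aligned(address, size)
```

A result of False means only that atomicity is not guaranteed; a particular implementation may still perform such an access atomically.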

The results for several combinations of loads and 
stores to the same or overlapping locations are 
described below. 


1. When two processors execute atomic stores to 
locations that do not overlap and no other stores 
are performed to those locations, the content of 
those locations is the same as if the two stores 
were performed by a single processor. 

2. When two processors execute atomic stores to 
the same storage location, and no other store is 
performed to that location, the content of that 
location is the result stored by one of the 
processors. 

3. When two processors execute stores that have
the same target location and that are not guaranteed
to be atomic, and no other store is performed
to that location, the result is some
combination of the bytes stored by both
processors.

4. When two processors execute stores to overlapped
locations, and no other store is performed
to those locations, the result is some combination
of the bytes stored by the processors to the overlapping
bytes. The portions of the locations that
do not overlap contain the bytes stored by the
processor storing to the location.

5. When a processor executes an atomic store to a
location, a second processor executes an atomic
load from that location, and no other store is performed
to that location, the value returned by the
load is the content of the location prior to the
store or the content of the location subsequent to
the store.

6. When a load and a store with the same target
location can be executed simultaneously, and no
other store is performed to the location, the value
returned by the load is some combination of the
content of the location before the store and after
the store.
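
The difference between cases 2 and 3 above can be made concrete by enumerating the contents a location may hold after two competing stores. The Python sketch below assumes that a non-atomic store may fragment at byte granularity, which is how the text describes the result ("some combination of the bytes"); the function name is hypothetical.

```python
from itertools import product


def possible_contents(store_a, store_b, atomic):
    """Return the set of final contents after two stores to one location.

    store_a, store_b: equal-length bytes objects written by two processors.
    atomic: True if both stores are guaranteed to be atomic.
    """
    if atomic:
        # One store or the other wins in its entirety.
        return {bytes(store_a), bytes(store_b)}
    # Otherwise each byte may independently come from either store.
    return {bytes(choice) for choice in product(*zip(store_a, store_b))}
```

For two halfword stores of 0x1122 and 0xAABB, the atomic case yields two possible results and the non-atomic case yields four, including the mixed values 0x11BB and 0xAA22.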


5.4 Memory Coherence 

Coherence refers to the ordering of writes to a single 
location. Atomic stores to a given location are 
coherent if they are serialized in some order, and no 
processor is able to observe any subset of those 
stores as occurring in a conflicting order. This serialization
order is an abstract sequence of values; the
physical memory location need not assume each of
the values written to it. For example, if a processor
has a store-in cache, it may update a location several
times before the value is written to the physical
memory. The result of a store operation is not available
to every processor at the same instant, and it
may be that a processor observes only some of the 
values that are written to a location. However, when 
a location is accessed atomically and coherently by 
all processors, then, for any processor, the sequence 
of values it loads from the location during any interval 
of time forms a subsequence of the sequence of 
values that the location logically held during that 
interval. That is, a processor can never load a 
“newer” value first and then, later, load an “older” 
value. 
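
The subsequence property can be checked mechanically. The Python sketch below is illustrative only: given the serialization order of the values stored to a location and the values one processor loaded over time, it reports whether the loads are consistent with coherence. The function name is an assumption, and the model lets a processor load the same value more than once.

```python
def is_coherent_observation(serialized, observed):
    """Check that observed loads never step backwards in the store order.

    serialized: values in the order the stores were serialized.
    observed: values returned by successive loads on one processor.
    """
    position = 0
    for value in observed:
        try:
            # A load may return the current value again or any later one.
            position = serialized.index(value, position)
        except ValueError:
            return False
    return True
```

With the stores serialized as 1, 2, 3, a processor may observe 1, 3 or 1, 1, 2, but never 2 followed by 1.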
As noted in Section 5.5, “Storage Control Attributes”
on page 121, the coherence of storage pages may be 
managed by hardware or software depending on the 
setting of the Memory Coherence attribute. 

Memory coherence is managed in blocks called 
coherence blocks. Their size is implementation-dependent
(see the Book IV, PowerPC Implementation
Features document for the implementation), but is 
usually larger than a word and often the size of a 
cache block. 

5.4.1 Coherence Required 

When a processor accesses a page in Memory Coherence
Required mode, each store to a location in that
page must be serialized with all stores to that location 
by all other processors that also access the location 
coherently. This can be implemented, for example, by 
an ownership protocol that allows at most one 
processor at a time to store to the location. 

Coherence does not ensure that the result of a store 
by one processor will be immediately visible to all 
other processors and mechanisms in the system. 
Only after a program has executed the sync instruction
are previous storage accesses it executed guaranteed
to be globally visible.

5.4.2 Coherence Not Required 

When an accessed page is in Memory Coherence Not 
Required mode, the processor need not enforce 
storage coherence. This coherence mode may be 
selected by software to improve performance when it 
is known that the particular area of storage the 
processor is accessing will not be accessed by 
another processor or mechanism. In this mode, software
must ensure that the appropriate Cache Management
instructions have been used to put storage
in a consistent state prior to changing the mode or 
allowing access to that storage area by a different 
processor or mechanism. 

- Programming Note - 

In a single-cache system, Coherence Required is 
not necessary for correct coherent execution. In 
fact, in such a system, Coherence Not Required 
may give better performance. 


5.5 Storage Control Attributes 

Some operating systems may provide means to allow 
programs to specify storage control attributes not 
described in this document. The definition of these 
attributes can be found in Part 3, “PowerPC Operating 
Environment Architecture” on page 141. The following
describes what is expected to be provided
when the operating system supports these functions. 
The details may vary among operating systems, so 
the details of the specific system being used must be 
known before these functions can be used. 

Generally, the program may use one of each of the 
following pairs of storage attributes: 

■ Write Through Required or Not Required 

■ Caching Inhibited or Allowed 

■ Memory Coherence Required or Not Required 

Not all combinations of these three modes are supported;
see Part 3, “PowerPC Operating Environment
Architecture” on page 141 for further details. 

A program can specify, through an operating system 
service, the attributes for each page of storage to 
which it has access. Each load or store will be performed
in the following manner, depending on the
setting of the storage control attributes for the page 
of storage containing the addressed storage location. 

Write Through 

This attribute is meaningful only for Caching 
Allowed storage. It provides the program control 
over whether 

■ the processor is required to update the copy of 
the storage location in the cache and in main 
storage, or 

■ the processor is allowed to update the copy of 
the storage location in the cache and to defer 
the update of main storage. 

Required 

Loads use the copy in the cache if it is there. 
Stores update the copy of the storage location 
in the cache if it is in the cache and also 
update the storage location in main storage.

Not Required

Loads and stores use the copy in the cache if 
it is there. The block containing the target 
storage location may be copied to the cache. 
The storage location in main storage need not 
contain the value most recently stored to that 
location. 
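
The two Write Through settings can be contrasted with a toy cache model in Python. The model is a sketch under simplifying assumptions: each cache entry holds a single location rather than a block, a load always copies the location into the cache (the architecture only says it may), and the class and method names are hypothetical.

```python
class TinyCache:
    """Toy single-level cache keyed by address (one location per entry)."""

    def __init__(self, main_storage):
        self.main = main_storage   # dict: address -> value
        self.lines = {}            # cached copies: address -> value

    def load(self, address):
        # Use the copy in the cache if it is there; otherwise fetch it.
        if address not in self.lines:
            self.lines[address] = self.main[address]
        return self.lines[address]

    def store(self, address, value, write_through_required):
        if write_through_required:
            # Update the cached copy only if present, and always main storage.
            if address in self.lines:
                self.lines[address] = value
            self.main[address] = value
        else:
            # Update the cache; the update of main storage is deferred.
            self.lines[address] = value
```

After a store with Write Through Not Required, main storage in this model still holds the old value until something (for example a Cache Management instruction) copies the block back.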
Caching

Inhibited 

When caching is inhibited, the Write Through 
attribute has no meaning. The load or store is 
executed in the following manner: 

1. The operation is performed to main 
storage bypassing the cache (i.e., neither 
the target location nor any of the block(s) 
containing it are copied into the cache). 

2. The operation causes an access 
(load/store) of appropriate length (i.e., 
byte, halfword, word, etc.) to the target 
location in main storage. 

It is considered a programming error if a copy 
of the target location of an access to Caching 
Inhibited storage is in the cache. Software 
must ensure that the location has not previously
been brought into the cache or, if it has,
that it has been flushed from the cache. If the 
programming error occurs, the result of the 
access is boundedly undefined. 

Allowed 

When caching is allowed, the access is performed
in the following manner:

1. If the block containing the target storage 
location is in the cache, it is used. 

2. If the block containing the target location 
is not in the cache, the block(s) of storage 
containing the target location may be 
copied to the cache and, if the access is a 
store, the target location is updated in the 
cache if it is in the cache. 

Memory Coherence 

This attribute provides the program control over 
whether the processor maintains storage coherence:

Required 

Stores by all processors to the same location 
are serialized into some order and no 
processor is able to observe any subset of 
those stores as occurring in a conflicting 
order. 

Not Required 

The order in which one processor observes 
the stores performed by one or more other 
processors is undefined. 

When coherence is required, its serialization function
is effective for all supported combinations of
the Write Through and Caching modes (see Part 3,
“PowerPC Operating Environment Architecture” on
page 141).

When coherence is not required, the programmer 
must manage the coherence of storage through use 
of sync and Cache Management instructions, and 
facilities provided by the operating system. 

- Programming Note - 

Software must ensure that all locations in a page 
have been purged from the cache prior to 
changing the storage mode for the page from 
Caching Allowed to Caching Inhibited. 


5.6 Cache Models 

The PowerPC architecture does not require any particular
cache organization and allows many different
implementations. However, for a program to execute 
correctly on all implementations, the programmer 
should assume that separate instruction and data 
caches exist, and should program to the separate 
cache model. The functions of these caches are 
affected by the storage control attributes associated 
with each storage access as described in 5.5, 
“Storage Control Attributes” on page 121. Cache 
Management instructions are provided so programs 
can manage the caches when needed. Depending on 
the storage control attributes specified by the 
program and the function being performed, the 
program may need to use these instructions to guarantee
that the function is performed correctly. The
Cache Management instructions are also useful to
optimize the use of memory bandwidth in such applications
as graphics and numerically intensive computing.

The processor is not required to maintain copies of 
storage locations in the instruction cache consistent 
with changes to storage resulting from the execution 
of store instructions. Program management of the 
cache is required when the program generates or 
modifies code that will be executed (i.e., when the 
program modifies data in storage and then attempts 
to execute the modified data as instructions). 

The instructions provided allow the program to 

■ invalidate the copy of storage in an instruction 
cache block (icbi) 

■ perform context synchronization, as described in
Part 3, “PowerPC Operating Environment
Architecture” on page 141 (isync)

■ copy the content of a data cache block to main
storage (dcbst)

■ copy the content of a data cache block to main
storage and make the copy of the block in the
data cache invalid (dcbf)

■ set the content of a data cache block to zeroes
(dcbz)

■ give a hint that a block of storage should be
copied into the data cache, so that the copy of
the block may be in the cache when subsequent
accesses to the block occur, thereby reducing
delays (dcbt, dcbtst)

The function of the Cache Management instructions 
depends on the implementation of the caches and on 
the storage control attributes associated with the 
cache block that is the target of the cache instruction. 

There are many variations of cache implementations 
and the following sections do not attempt to describe 
them exhaustively. However, the variations that 
affect the function of the Cache Management 
instructions are discussed here. 
- Programming Note -

Implementations will vary as to what instructions
need be executed to perform a function such as
code modification. Operating systems are encouraged
to provide a service (implementation-dependent)
to perform the function in an efficient
manner.
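
Why code modification needs program management of the caches can be seen in a toy model of separate instruction and data caches. The Python sketch below is an assumption-laden illustration, not a required instruction sequence: blocks are dictionary keys, the data cache is store-in, and the synchronization instructions (sync, isync) that a real sequence would also need are not modelled. The class and method names are hypothetical; only dcbst and icbi correspond to instructions named in the text.

```python
class SplitCacheModel:
    """Toy model of main storage with separate instruction and data caches."""

    def __init__(self, main):
        self.main = dict(main)   # block -> content
        self.dcache = {}         # modified data blocks not yet in main storage
        self.icache = {}         # instruction blocks fetched so far

    def store(self, block, content):
        # A store updates the data cache; main storage is updated later.
        self.dcache[block] = content

    def fetch(self, block):
        # Instruction fetch uses the instruction cache, else main storage.
        if block not in self.icache:
            self.icache[block] = self.main[block]
        return self.icache[block]

    def dcbst(self, block):
        # Copy the data cache block to main storage.
        if block in self.dcache:
            self.main[block] = self.dcache[block]

    def icbi(self, block):
        # Invalidate the block in the instruction cache.
        self.icache.pop(block, None)
```

In this model a fetch after a store still returns the old instructions; only after dcbst followed by icbi does the next fetch see the modified code.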


5.6.1 Split or Dual Caches 

A cache model in which there are separate caches for 
instructions and data is called a “Harvard style” 
cache. This style is the standard PowerPC cache 
model; that is, it is the model assumed by this architecture
and the function of the Cache Management
instructions depends on this model as well as on the
storage control attributes of the target storage block.
A copy of a target block in the cache is said to be
marked invalid if it will not be used for subsequent
accesses. The following sections describe the functions
performed by each of the Cache Management
instructions in this model. 

5.6.1.1 Instruction Cache Block 
Invalidate 

Invalidating the target block causes any subsequent 
fetch request for an instruction in the block to not find 
the block in the cache and to be sent to storage. The 
instruction performs the following operations: 

1. If the target block is not accessible to the 
program for loads, the system data storage error 
handler may be invoked. 

2. The target block in the instruction cache of the 
executing processor is marked invalid. 

3. If the effective address has an attribute of Coherence
Required, the block is invalidated in the
instruction caches of all other processors in the
system.

4. This access need not be recorded, but if it is, it is
considered a load and not a store.

5.6.1.2 Data Cache Block Store 

This instruction permits the program to ensure that 
the latest version of the target storage block is in 
main storage. The instruction performs the following 
operations: 

1. If the target block is not accessible to the 
program for loads, the system data storage error 
handler may be invoked. 

2. Memory Coherence 
Required 

If the target block is in any of the data caches 
in the system and has been modified, it is 
copied to main storage. 


Not Required 

If the target block is in the data cache of the 
executing processor and has been modified, it 
is copied to main storage. 

3. This access need not be recorded, but if it is, it is
considered a load and not a store.

The above action is taken regardless of the setting of 
the other storage control attributes. 

5.6.1.3 Data Cache Block Flush 

This instruction permits the program to ensure that 
the latest version of the target storage block is in 
main storage and no longer in the data cache. The 
instruction performs the same operations as does the 
Data Cache Block Store. In addition to those operations,
the following is done.

Memory Coherence Required 

If the target block is in any of the data caches in 
the system, it is marked invalid in those data 
caches. 

Memory Coherence Not Required 

If the target block is in the data cache of the executing
processor, it is marked invalid in that data
cache.

These actions are taken regardless of the setting of 
the other storage control attributes. 
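
The scope of Data Cache Block Store and Data Cache Block Flush under the two Memory Coherence settings can be sketched in Python. The representation is an assumption made for the example: each data cache is a dictionary mapping a block to an entry with "value" and "modified" keys, and the sketch leaves the modified flag untouched after a copy-back because the text does not specify it.

```python
def dcbst(caches, main, block, executing, coherence_required):
    """Copy a modified block to main storage.

    caches: list of per-processor data caches (dicts).
    executing: index of the processor executing the instruction.
    """
    targets = caches if coherence_required else [caches[executing]]
    for cache in targets:
        entry = cache.get(block)
        if entry is not None and entry["modified"]:
            main[block] = entry["value"]


def dcbf(caches, main, block, executing, coherence_required):
    """Perform dcbst, then mark the block invalid in the affected caches."""
    dcbst(caches, main, block, executing, coherence_required)
    targets = caches if coherence_required else [caches[executing]]
    for cache in targets:
        cache.pop(block, None)
```

With Coherence Not Required, a modified copy held by another processor is neither written back nor invalidated, which is why software must manage such storage itself.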

5.6.1.4 Data Cache Block set to Zero 

This instruction permits the program to set large 
areas of storage to zeros in an efficient manner. The 
instruction performs the following operations: 

1. If the target block is not accessible to the 
program for stores, the system data storage error 
handler is invoked. 

2. Write Through Required 

Either each byte of the block in main storage is 
set to 0x00, or the system alignment error 
handler is invoked. 

3. Caching Inhibited 

Either each byte of the block in main storage is 
set to 0x00, or the system alignment error 
handler is invoked. 

4. Memory Coherence 

■ Required

- If the target block is in the data cache of
the executing processor, each byte in the
block is set to 0x00 and all copies of the
block in all data caches are made consistent.

- If the target block is not in the data
cache of the executing processor, the
block is established in the data cache
without fetching it from storage and each
byte in the block is set to 0x00. All
copies of the block in all data caches are
made consistent.

■ Not Required

- If the target block is in the data cache of
the executing processor, each byte in the
block is set to 0x00.

- If the target block is not in the data
cache of the executing processor, the
block is established in the data cache
without fetching it from storage and each
byte in the block is set to 0x00.

5. This access must be recorded. It is considered a 
store to the target location. 
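
Because the instruction zeroes a whole aligned block, clearing an arbitrary region means splitting it into an unaligned head, a run of whole blocks, and an unaligned tail. The Python sketch below plans such a split. It is a hypothetical helper, not part of the architecture: the block size is implementation-dependent and is passed in as a parameter, and the plan assumes the storage is neither Write Through Required nor Caching Inhibited, where the alignment error handler may be invoked instead.

```python
def plan_zeroing(start, length, block_size):
    """Split a region into byte-wise pieces and whole-block zeroing steps.

    Returns a list of (kind, address, size) tuples, where kind is
    "bytes" (ordinary stores) or "dcbz" (one whole aligned block).
    """
    end = start + length
    first_block = -(-start // block_size) * block_size   # round start up
    last_block = end // block_size * block_size          # round end down
    if first_block >= last_block:
        # The region contains no complete block.
        return [("bytes", start, length)] if length else []
    plan = []
    if start < first_block:
        plan.append(("bytes", start, first_block - start))
    for address in range(first_block, last_block, block_size):
        plan.append(("dcbz", address, block_size))
    if last_block < end:
        plan.append(("bytes", last_block, end - last_block))
    return plan
```

For a 100-byte region starting at address 30 with an assumed 32-byte block, the plan is two bytes of ordinary stores, three whole blocks at 32, 64, and 96, and two more bytes at 128.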

5.6.1.5 Data Cache Block Touch 

The two Touch instructions (one for reading, the other 
for writing) provide a mechanism by which a program 
may avoid some of the delays due to accessing 
storage by attempting to have the target storage 
location in the cache prior to its first use. These 
instructions are performance hints and operate as 
follows: 

1. If the target block is not accessible to the 
program for loads, no other operation is performed.

2. Caching Inhibited 

The block is not copied into the cache and no 
other operations are performed. 

3. Caching Allowed 

■ Memory Coherence Required 

If the block is not in the cache, the most 
recent version of the block may be copied 
into the cache. 

■ Memory Coherence Not Required 

If the block is not in the cache, the block may 
be copied into the cache from main storage 
without regard for the location of the most 
recently modified version. 

4. This access need not be recorded, but if it is, it is
considered a load and not a store.

If the instruction is Touch for Store and the block is 
copied into the cache, it is copied in a manner such 
that a subsequent store to the block will execute efficiently.

The execution of either of these instructions never
causes the system data storage error handler to be invoked.

5.6.2 Combined Cache 

A combined cache implementation provides a single 
cache for instructions and data. For this implementation,
the Instruction Cache Block Invalidate instruction
need not perform the same operations as it would for 
an implementation with separate caches. It can be 
treated as a no-op, but it is acceptable to invalidate 
the instruction caches of other processors if the 
addressed storage is in Coherence Required mode. 


Following are recommended and required functions of 
this instruction for combined cache implementations. 

Prohibited Operations 

It must not invalidate a block in the combined 
cache that has been modified. The access must not 
be treated as a store. 

Unnecessary Operations 

The access should not be treated as a load or 
store, but to treat it as a load is not a violation of 
the architecture. 

Suggested Operations 

If the program executing icbi does not have access 
to the target block for loads, the system data 
storage error handler should be invoked. 

5.6.3 Write Through Data Cache 

The Cache Management instructions affected by the 
write through implementation of the data cache are 
listed in this section. These instructions must perform 
all the operations specified for a Harvard style cache 
except as specified in this section. Some of the differences
depend on whether the write through implementation
is a write through to main storage or just a
write through to a second level of cache. 

5.6.3.1 Write Through to Main Storage 

1. Data Cache Block set to Zero 

The processor may invoke the system alignment 
error handler regardless of the setting of the 
storage control attributes. 

2. Data Cache Block Store 

By definition, the cache cannot contain a modified 
block. The processor is not required to copy the 
target block to main storage. 

3. Data Cache Block Flush 

By definition, the cache cannot contain a modified 
block. The processor is not required to copy the 
target block to main storage. 

5.6.3.2 Write Through to Multi-Level 
Cache 

For Data Cache Block set to Zero, the processor may 
invoke the system alignment error handler regardless 
of the setting of the storage control attributes. 

If a cache is the interface to main storage for all 
processors and other mechanisms that access 
storage, that cache can be considered main storage 
with respect to the Cache Management instructions. 
Otherwise, the cache instructions that cause the 
content of a cache block to be copied back to main 
storage or to be marked invalid must be performed 
against all levels of the cache. 
5.7 Shared Storage

This architecture supports the sharing of storage
between programs, between different instances of the
same program on systems with one or more
processors, and between processors and other mechanisms.
It also supports access to a storage location
by one or more programs using different effective
addresses. All these cases are considered storage
sharing. Storage is shared in blocks that are an integral
number of pages.

When the same storage location has different effec¬ 
tive addresses, the addresses are said to be 
“aliases.” Each application can be granted separate 
access privileges to aliased pages. 

5.7.1 Storage Access Ordering 

The PowerPC architecture specifies a weakly consistent
storage model for shared storage multiprocessor
systems. This model provides an
opportunity for significantly improved performance 
over the strongly consistent model, but places the 
responsibility on the program to ensure that ordering 
or synchronization instructions are properly placed 
when necessary for the correct execution of the 
program. 

In this architecture, the order in which the processor 
performs storage accesses, the order in which those 
accesses complete in main storage, and the order in 
which those accesses are viewed as occurring by 
another processor may all be different. This property 
is referred to as storage access ordering. A means of
enforcing an ordering of storage accesses is provided
to allow programs or instances of programs to share
storage. Similar means are needed to allow programs
executing on a processor to share storage with
some other mechanism, such as an I/O device, that 
can also access storage. 

The purpose of specifying a weakly consistent storage
model is to allow the processor to run very fast for
most storage accesses. Two instructions, Enforce
In-order Execution of I/O and Synchronize, are provided
that enable the program to control the order in which
storage accesses are performed by separate
instructions. No ordering should be assumed for the
storage accesses done by a multiple-register load or
store instruction, and no means are provided for
controlling that order.


5.7.1.1 The Enforce In-order Execution 
of I/O Instruction 

The eieio instruction permits the program to control
the order in which loads and stores are performed in
main storage when the accessed storage is both
Caching Inhibited and Guarded, and the order in
which stores are performed in main storage when the
accessed storage is Write Through Required. It does
not affect the order of other data accesses, nor of
cache operations (whether caused explicitly by
execution of a Cache Management instruction, or
implicitly by the cache coherence mechanism). See Part 3,
“PowerPC Operating Environment Architecture” on
page 141, for the definition of Guarded storage.

eieio ensures that all applicable data accesses to
main storage previously initiated by the processor
have completed with respect to main storage before
any applicable storage accesses subsequently
initiated by the processor access main storage. It acts
like a barrier that flows through the storage queues
and to main storage, preventing the reordering of
storage accesses across the barrier. The eieio
instruction may complete before previously initiated
storage accesses have been performed with respect
to other processors and mechanisms.

eieio can be used, for example, to ensure that the
data from a sequence of stores to the control
registers of an I/O device update those control registers in
the order specified by the stores as ordered by eieio.

If stronger ordering is desired or if it is necessary to 
order accesses to storage that may be in the cache, 
the sync instruction must be used. 

5.7.1.2 The Synchronize Instruction 

When a portion of storage must be forced to a known
state, it is necessary to synchronize storage with
respect to all processors. This is accomplished by
requiring programs to indicate explicitly in the
instruction stream that synchronization is required, by
inserting a sync instruction. Only when sync
completes are the effects of all storage accesses
previously executed by the program guaranteed to have
been performed with respect to all other processors
and mechanisms.

The sync instruction permits the program to ensure 
that all storage accesses it has initiated have been 
performed with respect to all other processors and 
mechanisms before its next instruction is executed. A 
program can use this instruction to ensure that all 
updates to a shared data structure are visible to all 
other processors prior to executing a store that will 
release the lock on that data structure. Execution of 
this instruction does the following: 
■ Performs the functions described for the sync
instruction in Part 1, “PowerPC User Instruction
Set Architecture” on page 1.

■ Ensures that consistency operations and the
effects of icbi, dcbz, dcbst, dcbf, and dcbi
instructions (see Part 3, “PowerPC Operating
Environment Architecture” on page 141)
previously executed by the processor executing the
sync have completed on all other processors.

■ Ensures that TLB invalidates executed by the
processor executing the sync have completed on
that processor. sync does not wait for such
invalidates to complete on other processors (see the
Book III section entitled “Table Update
Synchronization Requirements”).

■ Ensures that Reference and Change bits in the
Page Table (see Part 3, “PowerPC Operating
Environment Architecture” on page 141) are
up-to-date.

The sync instruction is execution synchronizing (see 
Part 3, “PowerPC Operating Environment 
Architecture” on page 141). It is not context synchro¬ 
nizing (see Book III), and therefore need not discard 
prefetched instructions. 

For storage that is maintained as Memory Coherence 
Not Required, the only effect of sync on storage oper¬ 
ations is to ensure that all previous storage accesses 
have completed to the level of storage specified by 
the Caching and Write Through storage control attri¬ 
butes (including the updating of Reference and 
Change bits). 

5.7.2 Atomic Update Primitives 

The Load And Reserve and Store Conditional
instructions together permit atomic update of a
storage location. 64-bit implementations have word
and doubleword forms of each of these instructions.
Described here is the operation of the word forms
(lwarx and stwcx.); operation of the doubleword forms
(ldarx and stdcx.) is the same except for obvious
substitutions.

These instructions function in Caching Inhibited, as 
well as in Caching Allowed, storage. The addressed 
page must, however, have the Memory Coherence 
Required attribute for every processor other than the 
one doing the atomic update that might execute a 
store to the location being atomically updated. The 
remainder of this section assumes that if the system 
is a multiprocessor, then all processors have the 
addressed page in Memory Coherence Required 
mode. 

If the addressed storage is in Write Through Required 
mode, it is implementation-dependent whether these 
instructions function correctly or cause the system 
data storage error handler to be invoked. 


The lwarx is a load from a word-aligned location that
has two side effects.

1. A nonspecific reservation for a subsequent stwcx.
or stdcx. is created.

2. The storage coherence mechanism is notified that
a reservation exists for the real address (see
Part 3, “PowerPC Operating Environment
Architecture” on page 141) corresponding to the
storage location accessed by the lwarx.

The stwcx. is a store to a word-aligned location that is
conditioned on the existence of the reservation
created by the lwarx or ldarx. To emulate an atomic
operation with these instructions, it is necessary that
both the lwarx and the stwcx. access the same
storage location even though this requirement is not
enforced by the hardware. lwarx and stwcx. are
ordered by a dependence on the reservation, and the
program is not required to insert other instructions to
maintain the order of storage accesses by these two
instructions.

A stwcx. performs a store to the target storage
location only if the storage location accessed by the
lwarx that established the reservation has not been
stored into by another processor or mechanism
between supplying a value for the lwarx and storing
the value supplied by the stwcx. In this case, CR0 is
set to indicate that the store was performed.

If the stwcx. completes but does not perform the
store because a reservation no longer exists, CR0 is
set to indicate that the stwcx. completed but storage
was not altered.

Examples of the use of lwarx and stwcx. are given in
the “Programming Examples” appendix of Part 1,
“PowerPC User Instruction Set Architecture” on
page 1.

When stwcx. to a given location succeeds, its store
has been performed but may not yet be globally
visible. As a result, a subsequent load or lwarx from
the given location on another processor may return a
“stale” value. However, a subsequent lwarx from the
given location on the other processor followed by a
successful stwcx. on that processor is guaranteed to
have returned the value stored by the first
processor's stwcx. (in the absence of other stores to the
given location).


- Programming Note - 

To ensure that a store or stwcx. to a given
location has become globally visible, it must be
followed by a sync. A subsequent load or lwarx
from the given location by another processor will
then return a value at least as recent as the value
stored. This is often more synchronization than is
actually needed to ensure program correctness.
5.7.2.1 Reservations

The ability to emulate an atomic operation using
lwarx and stwcx. is based on the conditional behavior
of stwcx., the reservation set by lwarx, and the
clearing of that reservation if the target location is
modified by another processor or other mechanism
before the stwcx. performs its store.


A processor has at most one reservation at any time.
A reservation is established by executing a lwarx
instruction and is lost if any of the following occur:

■ The processor holding the reservation executes
another lwarx or ldarx; this clears the first
reservation and establishes a new one.

■ The processor holding the reservation executes
any stwcx. or stdcx., whether or not its address
matches that of the lwarx.

■ Some other processor executes a store or dcbz to
the same reservation granule.

■ Some other mechanism modifies a storage
location in the same reservation granule.

■ Any additional causes of reservation loss are
described in Book IV, PowerPC Implementation
Features, for the implementation.

Interrupts (see Part 3, “PowerPC Operating
Environment Architecture” on page 141) do not clear
reservations (however, system software invoked by
interrupts may clear reservations). Immunity to
random reservation loss ensures that programs using
lwarx and stwcx. can make forward progress.
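The reservation rules listed above can be explored with a small software model. The following Python sketch is illustrative only and is not a description of any implementation: the class name ReservationModel, its method names, and the 32-byte granule size are assumptions made for this example. It models one reservation per processor, loss of the reservation on a store by another processor to the same granule, and clearing of the reservation by any stwcx.

```python
class ReservationModel:
    """Toy model of lwarx/stwcx. reservations (at most one per processor)."""

    def __init__(self, granule_size=32):
        # granule_size is an assumed value; the real size is
        # implementation-dependent.
        self.granule_size = granule_size
        self.memory = {}       # real address -> word value
        self.reservation = {}  # processor id -> reserved granule number

    def _granule(self, addr):
        return addr // self.granule_size

    def lwarx(self, cpu, addr):
        # A new lwarx replaces any reservation this processor already holds.
        self.reservation[cpu] = self._granule(addr)
        return self.memory.get(addr, 0)

    def store(self, cpu, addr, value):
        # A store clears other processors' reservations on the same granule.
        self.memory[addr] = value
        granule = self._granule(addr)
        for other, held in list(self.reservation.items()):
            if other != cpu and held == granule:
                del self.reservation[other]

    def stwcx(self, cpu, addr, value):
        # Any stwcx. clears the processor's reservation, whether or not the
        # store is performed. The address is not compared with the lwarx
        # address, mirroring the statement that hardware does not enforce it.
        held = self.reservation.pop(cpu, None)
        if held is None:
            return False       # CR0 would indicate "store not performed"
        self.store(cpu, addr, value)
        return True            # CR0 would indicate "store performed"


def atomic_increment(model, cpu, addr):
    """Retry until the conditional store succeeds; return the old value."""
    while True:
        old = model.lwarx(cpu, addr)
        if model.stwcx(cpu, addr, old + 1):
            return old
```

In this model a store by processor 1 to address 0x104 clears a reservation that processor 0 holds on address 0x100, because both addresses lie in the same assumed 32-byte granule.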


— Programming Note - 

Programming convention must ensure that lwarx
and stwcx. addresses match. In proper use, a
stwcx. should be paired with a specific lwarx to
the same real address. Situations in which a
stwcx. may erroneously be issued after some
lwarx other than that with which it is intended to
be paired must be scrupulously avoided. For
example, there must not be a context change in
which the old context leaves a lwarx dangling and
the new context resumes after a lwarx and before
the paired stwcx. The stwcx. would be
successfully completed, which is not what was intended
by the programmer.

Such a situation must be prevented by issuing a
stwcx. to a dummy writable word-aligned location
as part of the context switch, thereby clearing the
reservation of the dangling lwarx. Executing
stwcx. to a word-aligned location suffices to clear
the reservation, whether it was obtained by lwarx
or ldarx.


5.7.2.2 Guaranteeing Forward Progress 

Forward progress in loops that use lwarx and stwcx.
is guaranteed by a cooperative effort between
hardware, operating system software, and application
software. Hardware guarantees that:

■ one stwcx. among a set of processors holding 
reservations to the same real address will 
succeed, and 

■ reservations are not lost unnecessarily, i.e. when 
the reserved location has not been modified. 

While no general rules can be given regarding oper¬ 
ating system guarantees, programs that use the 
examples in the Programming Examples appendix of 
Part 1, “PowerPC User Instruction Set Architecture” 
on page 1 are guaranteed forward progress. 

5.7.2.3 Reservation Loss Due to 
Granularity 

When one processor holds a reservation, and another 
processor performs a store that might clear that res¬ 
ervation, the address comparison is done in a way 
that ignores an implementation-dependent number of 
low-order bits of the real addresses. The storage 
block corresponding to the ignored low-order bits is 
called the reservation granule. Its size is 
implementation-dependent (see the Book IV, PowerPC 
Implementation Features document for the implemen¬ 
tation), but is a multiple of the coherence block size. 

Lock variables should be allocated such that
contention for the locks and updates to nearby data
structures do not cause excessive reservation losses
due to false indications of sharing that can occur due
to the reservation granularity.


- Programming Note -

The combination of lwarx and stwcx. improves
upon compare_and_swap in that the reservation
binds the lwarx and stwcx. together more reliably.
Compare_and_swap can only check that the old
and current values of the variable are equal, and
can cause the program to err if the variable has
been modified and the old value subsequently
restored. The reservation is always lost if the
variable is modified by another processor or
mechanism between the lwarx and stwcx., so the
stwcx. never succeeds unless the variable has not
been stored into (by another processor or
mechanism) since the lwarx.
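The difference between a reservation and compare_and_swap that the Programming Note in this section describes can be shown with a short model. The Python sketch below is an illustration under stated assumptions: the class Cell and its method names are invented for this example, and the model tracks a single observing processor only. Another processor changes the variable from "A" to "B" and back to "A" between the read and the update.

```python
class Cell:
    """A shared word plus a reservation flag for one observing processor."""

    def __init__(self, value):
        self.value = value
        self.reserved = False

    def load_and_reserve(self):
        self.reserved = True
        return self.value

    def store_by_other(self, value):
        # Any store by another processor clears the reservation, even if it
        # writes a value that the variable held earlier.
        self.value = value
        self.reserved = False

    def store_conditional(self, value):
        ok = self.reserved
        self.reserved = False
        if ok:
            self.value = value
        return ok

    def compare_and_swap(self, expected, new):
        # Compares values only, so a change followed by a restore is invisible.
        if self.value == expected:
            self.value = new
            return True
        return False


def aba_demo():
    """Return (compare_and_swap succeeded, store conditional succeeded)."""
    cas_cell = Cell("A")
    old = cas_cell.value
    cas_cell.store_by_other("B")
    cas_cell.store_by_other("A")
    cas_ok = cas_cell.compare_and_swap(old, "C")

    res_cell = Cell("A")
    old = res_cell.load_and_reserve()
    res_cell.store_by_other("B")
    res_cell.store_by_other("A")
    res_ok = res_cell.store_conditional("C")
    return cas_ok, res_ok
```

In this model the compare_and_swap succeeds although the variable was modified in the interim, while the conditional store fails and the program would retry.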

A processor holding a reservation on any word in a 
reservation granule will lose its reservation if some 
other processor stores anywhere in that granule. 
Such problems can be avoided only by ensuring that 
few such stores occur. This can most easily be 
accomplished by allocating an entire granule for a 
lock and wasting all but one word. 

Reservation granularity may vary for each implemen¬ 
tation. There are no architectural restrictions 
bounding the granularity implementations must 
support, so reasonably portable code must dynam¬ 
ically allocate aligned and padded storage for locks to 
guarantee absence of granularity-induced reservation 
loss. 
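The allocation advice above amounts to giving each lock its own reservation granule. The following Python sketch shows the address arithmetic only. It assumes the granule size is a power of two (consistent with a comparison that ignores low-order address bits) and that the size has been obtained at run time, for example from an operating system service. The function names are hypothetical.

```python
def same_granule(addr_a, addr_b, granule_size):
    """True if two real addresses fall in the same reservation granule."""
    if granule_size <= 0 or granule_size & (granule_size - 1):
        raise ValueError("granule_size is assumed to be a power of two")
    mask = ~(granule_size - 1)
    return (addr_a & mask) == (addr_b & mask)


def place_locks(base, count, granule_size):
    """Return one granule-aligned address per lock, at or after base."""
    if granule_size <= 0 or granule_size & (granule_size - 1):
        raise ValueError("granule_size is assumed to be a power of two")
    first = (base + granule_size - 1) & ~(granule_size - 1)
    return [first + i * granule_size for i in range(count)]
```

Each lock placed this way occupies one word of its granule; the rest of the granule is left unused, as the text suggests.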


5.8 Virtual Storage 

The PowerPC system implements a virtual storage 
model for applications. This means that a combina¬ 
tion of hardware and software can present a storage 
model which allows applications to exist within a 
“virtual” address space larger than either the effec¬ 
tive address space or the real address space. 

Each program can access 2^64 {2^32} bytes of “effective
address” (EA) space, subject to limitations imposed
by the operating system. In a typical PowerPC
system, each program's EA space is a subset of a
larger “virtual address” (VA) space managed by the
operating system.

The operating system is responsible for managing the 
real (physical) storage resources of the system by 
means of a “storage mapping” mechanism. Storage 
is always allocated and managed in units of “pages,” 
which have a fixed, implementation-dependent size. 
The storage mapping process translates accesses to 
pages in the EA space into accesses to real pages in 
main storage. 

In general, main storage may not be large enough to 
contain all of the virtual pages used by the currently 
active applications. With support provided by hard¬ 
ware mechanisms, the operating system can attempt 
to use the available real pages to map a sufficient set 
of effective address pages of the applications. If a 
sufficient set is maintained, “paging” activity is mini¬ 
mized. If not, performance degradation is likely to 
occur. 

The operating system can support restricted access to 
pages (including read-write, read-only, and no access: 
see Part 3, “PowerPC Operating Environment 
Architecture” on page 141), based on system stand¬ 
ards (e.g., program code might be read-only) and 
application requests. 
Chapter 6. Effect of Operand Placement on Performance


The placement (location and alignment) of operands 
in storage affects relative performance of storage 
accesses, and in some cases affects it significantly. 
The best performance is guaranteed if storage oper¬ 
ands are aligned. In order to obtain the best perform¬ 
ance across the widest range of implementations, the 
programmer should assume the performance model 
described in Figures 35 and 36 with respect to the 
placement of storage operands. Figure 35 applies 
when the processor is in Big-Endian mode, and Figure 
36 applies when the processor is in Little-Endian 
mode. Performance of accesses varies depending on 
the following: 

1. Operand Size 

2. Operand Alignment 

3. Endian mode (Big-Endian or Little-Endian) 

4. Crossing no boundary 

5. Crossing a Cache Block Boundary 

6. Crossing a Page Boundary that is also a pro¬ 
tection boundary (see Part 3, “PowerPC Oper¬ 
ating Environment Architecture” on page 141, 
“Storage Protection”). 

7. Crossing a BAT Boundary 

See Book III for a description of BAT. 

8. Crossing a Segment Boundary 

See Book III for a description of storage seg¬ 
ments. 

The Load and Store Multiple instructions are defined 
to operate only on aligned operands. The Move 
Assist instructions have no alignment requirements. 
Both of these sets of instructions are supported only 
in Big-Endian mode. 

For the purposes of Figures 35 and 36, crossing pages 
with different storage control attributes is equivalent 
to crossing a segment boundary. 
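The placement factors listed above can be checked mechanically for a given operand. The Python sketch below is a minimal illustration, not part of the architecture: the function name and the default cache block size (32 bytes) and page size (4096 bytes) are assumptions, since both sizes are implementation-dependent. BAT and segment boundaries are omitted because they depend on the storage mapping set up by the operating system.

```python
def classify_access(addr, size, cache_block=32, page=4096):
    """Report alignment and boundary crossings for an operand of `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    last = addr + size - 1  # address of the operand's final byte
    return {
        "aligned": addr % size == 0,
        "crosses_cache_block": addr // cache_block != last // cache_block,
        "crosses_page": addr // page != last // page,
    }
```

A word at address 0x101E, for example, is unaligned and spans two of the assumed 32-byte cache blocks, but stays within one page.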


Operand                  Boundary Crossing
Size         Byte        None      Cache     Page      BAT/
             Align.                Block               Seg.

Integer
8 Byte       8           optimal   -         -         -
             4           good      good      poor      poor
             <4          poor      poor      poor      poor
4 Byte       4           optimal   -         -         -
             <4          good      good      poor      poor
2 Byte       2           optimal   -         -         -
             <2          good      good      poor      poor
1 Byte       1           optimal   -         -         -
lmw, stmw    4           good      good      good      poor
string                   good      good      poor      poor

Float
8 Byte       8           optimal   -         -         -
             4           good      good      poor      poor
             <4          poor      poor      poor      poor
4 Byte       4           optimal   -         -         -
             <4          poor      poor      poor      poor


Figure 35. Performance Effects of Storage Operand
Placement, Big-Endian mode
Operand                  Boundary Crossing
Size         Byte        None      Cache     Page      BAT/
             Align.                Block               Seg.

Integer
8 Byte       8           optimal   -         -         -
             <8          poor      poor      poor      poor
4 Byte       4           optimal   -         -         -
             <4          poor      poor      poor      poor
2 Byte       2           optimal   -         -         -
             <2          poor      poor      poor      poor
1 Byte       1           optimal   -         -         -

Float
8 Byte       8           optimal   -         -         -
             <8          poor      poor      poor      poor
4 Byte       4           optimal   -         -         -
             <4          poor      poor      poor      poor


Figure 36. Performance Effects of Storage Operand
Placement, Little-Endian mode


6.1 Instruction Restart 

If a storage access crosses a page boundary that is 
also a protection boundary, a BAT boundary, or a 
segment boundary, a number of conditions could 
cause the execution of the instruction to be aborted 
after part of the access has been performed. For 
example, this may occur when a program attempts to 
access a page it has not previously accessed, or 
when the processor must check for a possible change 
in storage control attributes when an access crosses 
a page boundary. When this occurs, the implementa¬ 
tion or the operating system may restart the instruc¬ 
tion. If the instruction is restarted, some bytes of the 
location may be loaded from or stored to the target 
location a second time. 

The following rules apply to storage accesses with 
regard to restarting the instruction. 

Aligned Accesses 

A single-register instruction which accesses an 
aligned operand is never restarted. 


Unaligned Accesses 

A single-register instruction which accesses an 
unaligned operand may be restarted if the access 
crosses a page, BAT, or segment boundary. 

Load and Store Multiple, Move Assist 

These instructions may be restarted if, in 
accessing the locations specified by the instruc¬ 
tion, a page, BAT, or segment boundary is 
crossed. 


6.2 Atomicity and Order 

Access Atomicity 

With the exception of double-precision floating-point 
operands in 32-bit implementations, all aligned 
accesses are atomic. No other access is required to 
be atomic. Instructions causing multiple accesses 
(Load and Store Multiple and Move Assist) are not 
atomic. 

Access Order 

Since the ordering of storage accesses is not guaran¬ 
teed unless the programmer inserts the appropriate 
ordering instructions, the order of accesses generated 
by a single instruction is not guaranteed. Unaligned 
accesses, Load and Store Multiple instructions, and 
Move Assist instructions have no implicit ordering 
characteristics. For example, processor A may store 
a word operand on an odd halfword boundary. It may 
appear to processor A that the store completed atom¬ 
ically. Processor or other mechanism B, executing a 
load from the same location, may get a result that is 
a combination of the value of the first halfword that 
existed prior to the store by processor A and the 
value of the second halfword stored by processor A. 
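The example above can be made concrete by enumerating what an observer might see. The Python sketch below assumes, for illustration only, that an unaligned word store is performed as two independent halfword accesses in either order; the architecture does not specify how an implementation splits such an access. The function name is hypothetical.

```python
def torn_views(old_word: bytes, new_word: bytes):
    """Values a concurrent reader might see while a 4-byte store is in progress,
    assuming the store is split into two halfword accesses in either order."""
    if len(old_word) != 4 or len(new_word) != 4:
        raise ValueError("both operands must be 4 bytes long")
    return {
        old_word,                     # neither halfword stored yet
        new_word[:2] + old_word[2:],  # only the first halfword stored
        old_word[:2] + new_word[2:],  # only the second halfword stored
        new_word,                     # both halfwords stored
    }
```

The third entry corresponds to the case described in the text: the first halfword as it existed before the store, combined with the second halfword stored by processor A.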


- Programming Note - 

The programmer should assume that any
unaligned access in an ordinary storage segment
might be restarted. Software can ensure this
does not occur by use of direct-store segments or
BAT areas, neither of which have page
boundaries (see Part 3, “PowerPC Operating Environment
Architecture” on page 141).

Unsynchronized TLB invalidates do not have a 
defined result. 
Chapter 7. Storage Control Instructions


The instructions in this chapter are not privileged. 
For most of them, if the applicable cache is not 
present the operation is a “no-op” and has no effect 
on any register or on storage. The only exception is 
the dcbz instruction. When the data cache does not 
exist, dcbz either zeros a certain number of bytes of 
storage (which has an effect similar to zeroing bytes 
in a cache block which are later written to storage) or 
invokes the system alignment error handler (so that 
its function can be simulated). 

As with other storage instructions, the effect of the 
Cache Management instructions on storage is weakly 
consistent. If the programmer needs to ensure that 
Cache Management or other instructions have been 
performed with respect to all other processors and 
mechanisms, a sync instruction must be placed in the 
program following those instructions. 

The description of many of the Cache Management 
instructions has a statement that defines its storage 
semantics, such as “This instruction is treated as a 
store to the addressed byte with respect to address 
translation and protection.” This statement defines 
the operation of the instruction with respect to how it 
affects the page Reference and Change bits, and 
whether or not interrupts occur for a translation error 
or a protection violation (see Part 3, “PowerPC Oper¬ 
ating Environment Architecture” on page 141). 


7.1 Parameters Useful to 
Application Programs 

It is suggested that the operating system provide a 
service that allows an application program to obtain 
the following information. 

1. Page size 

2. Coherence block size 

3. Granule size for reservations 

4. An indicator of whether the processor has (a) a 
combined cache or no caches, or (b) some other 
cache configuration (split caches or one cache 
only; if instruction cache fetches pass through the 
data cache, the cache is considered to be a split 
cache) 

5. Instruction cache size 

6. Data cache size 

7. Instruction cache line size (see Book IV, PowerPC 
Implementation Features) 

8. Data cache line size (see Book IV) 

9. Block size for icbi (if no instruction cache, number 
of bytes zeroed by dcbz) 

10. Block size for dcbt and dcbtst (if no data cache,
number of bytes zeroed by dcbz)

11. Block size for dcbz, dcbst, dcbf, and dcbi (see
Part 3, “PowerPC Operating Environment
Architecture” on page 141 for a description of
dcbi) (if no data cache, number of bytes zeroed
by dcbz)

12. Instruction cache associativity 

13. Data cache associativity 

14. Factors for converting the Time Base to seconds 

If the caches are combined, the same value should be
given for an I-cache attribute and the corresponding
D-cache attribute.
7.2 Cache Management Instructions

7.2.1 Instruction Cache Instructions 


Instruction caches, if they exist, are not required to be
consistent with data caches, storage, or I/O data
transfers. Software must use the appropriate Cache
Management instructions to ensure that instruction 
caches are kept consistent when instructions are 
modified by the processor or by input data transfer. 
When a processor alters a storage location that may 
be contained in an instruction cache, software must 
ensure that updates to storage are visible to the 
instruction fetching mechanism. Although the 
instructions to accomplish this vary among implemen¬ 
tations and hence many operating systems will 
provide a system service for this function, the fol¬ 
lowing sequence is typical. 

1. dcbst - update storage

2. sync - wait for update (see Part 1, “PowerPC
User Instruction Set Architecture” on page 1)

3. icbi - invalidate copy in instruction cache

4. isync - perform context synchronization (see
Part 3, “PowerPC Operating Environment
Architecture” on page 141)

These operations are necessary because the storage
may be in Write Through Not Required mode. Since
instruction fetching may bypass the data cache,
changes made to items in the data cache may not be
reflected in storage until after the instruction fetch
completes.


Instruction Cache Block Invalidate X-form

icbi RA,RB

|  31  |  ///  |  RA  |  RB  |  982   | / |
0      6       11     16     21       31

Let the effective address (EA) be the sum
(RA|0) + (RB).

If the block containing the byte addressed by EA is in
Coherence Required mode, and a block containing the
byte addressed by EA is in the instruction cache of
any processor, the block is made invalid in all such
processors, so that subsequent references cause the
block to be refetched.

If the block containing the byte addressed by EA is in
Coherence Not Required mode, and a block containing
the byte addressed by EA is in the instruction cache
of this processor, the block is made invalid in this
processor, so that subsequent references cause the
block to be fetched from main storage (or perhaps
from a data cache).

It is acceptable to treat this instruction as a load from
the addressed byte with respect to address
translation and protection. Implementations with a
combined data and instruction cache may treat the icbi
instruction as a no-op, even to the extent of not
validating the EA.

If the EA references storage outside of main storage
(see Direct-Store Segments in Part 3, “PowerPC
Operating Environment Architecture” on page 141),
the instruction is treated as a no-op.

Special Registers Altered:

None
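The X-form layout shown for icbi (primary opcode 31 in bits 0-5, RA in bits 11-15, RB in bits 16-20, extended opcode 982 in bits 21-30) can be turned into a 32-bit instruction word with a few shifts. The Python sketch below is illustrative: the function names are hypothetical, and encoding the reserved fields as zero is an assumption of this example. Bit 0 is the most significant bit of the word.

```python
def encode_x_form(primary, f6, f11, f16, extended):
    """Pack an instruction word; field names give each field's starting bit."""
    fields = (("primary", primary, 6), ("f6", f6, 5), ("f11", f11, 5),
              ("f16", f16, 5), ("extended", extended, 10))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Bit 31 (the final reserved bit) is left as zero.
    return (primary << 26) | (f6 << 21) | (f11 << 16) | (f16 << 11) | (extended << 1)


def icbi(ra, rb):
    """Encode icbi RA,RB (primary opcode 31, extended opcode 982)."""
    return encode_x_form(31, 0, ra, rb, 982)


def dcbz(ra, rb):
    """Encode dcbz RA,RB (primary opcode 31, extended opcode 1014)."""
    return encode_x_form(31, 0, ra, rb, 1014)
```

For example, icbi(0, 3) produces the word 0x7C001FAC.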


Instruction Synchronize XL-form 

isync 

[Power mnemonic: ics] 


|  19  |  ///  | ///  | ///  |  150   | / |
0      6       11     16     21       31


This instruction waits for all previous instructions to 
complete and then discards any prefetched 
instructions, causing subsequent instructions to be 
fetched (or refetched) from storage and to execute in 
the context established by the previous instructions. 
This instruction has no effect on other processors or 
on their caches. 

This instruction is context synchronizing (see Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141). 

Special Registers Altered: 

None 
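With icbi and isync defined, the typical sequence given in this section for making modified instructions visible to instruction fetching (dcbst, sync, icbi, isync) can be sketched for a range of bytes. The Python sketch below only lists the steps; it does not execute them. Applying dcbst to every block before the sync, and using one block size for both dcbst and icbi, are assumptions of this example, and the text notes that the required instructions vary among implementations. The function name is hypothetical.

```python
def code_update_sequence(start, length, block_size):
    """List (mnemonic, address) steps covering the bytes [start, start+length)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if length <= 0:
        return []
    end = start + length - 1
    first = start - (start % block_size)  # block containing the first byte
    last = end - (end % block_size)       # block containing the last byte
    blocks = list(range(first, last + 1, block_size))
    steps = [("dcbst", block) for block in blocks]  # update storage
    steps.append(("sync", None))                    # wait for the updates
    steps += [("icbi", block) for block in blocks]  # invalidate stale copies
    steps.append(("isync", None))                   # discard prefetched instructions
    return steps
```

An operating system service that performs this function, where one is provided, is preferable to an open-coded sequence, for the reasons given in the text.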
7.2.2 Data Cache Instructions


Data caches and combined caches, if they exist, are
required to be consistent with other data caches,
combined caches, storage, and I/O data transfers.
However, to ensure consistency, aliased effective
addresses (two effective addresses that map to the
same real address) must have the same page offset
(see Section 5.7, “Shared Storage” on page 125).

If the effective address references storage outside of 
main storage (see Direct-Store Segments in Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141), the instruction is treated as a no-op. 


Data Cache Block Touch X-form 


dcbt RA,RB

|  31  |  ///  |  RA  |  RB  |  278   | / |
0      6       11     16     21       31

Let the effective address (EA) be the sum
(RA|0) + (RB).

This instruction is a hint that performance will
probably be improved if the block containing the byte
addressed by EA is fetched into the data cache,
because the program will probably soon load from the
addressed byte. Executing dcbt will not cause the
system error handler to be invoked.

It is acceptable to treat this instruction as a load from 
the addressed byte with respect to address trans¬ 
lation and protection, except that the system error 
handler must not be invoked for a translation or pro¬ 
tection violation. 

Special Registers Altered: 

None 

- Programming Note - 

The purpose of this instruction is to allow the 
program to request a cache block fetch before it 
is actually needed by the program. The program 
can later perform loads to put data into registers. 
However, the processor is not obliged to load the 
addressed block into the data cache. 


Data Cache Block Touch for Store X-form 


dcbtst RA,RB

|  31  |  ///  |  RA  |  RB  |  246   | / |
0      6       11     16     21       31

Let the effective address (EA) be the sum
(RA|0) + (RB).

This instruction is a hint that performance will
probably be improved if the block containing the byte
addressed by EA is fetched into the data cache,
because the program will probably soon store into the
addressed byte. Executing dcbtst will not cause the
system error handler to be invoked.

It is acceptable to treat this instruction as a load from
the addressed byte with respect to address
translation and protection, except that the system error
handler must not be invoked for a translation or
protection violation. Since dcbtst does not modify
storage, it must not be recorded as a store.

Special Registers Altered: 

None 

- Programming Note - 

The purpose of this instruction is to allow the 
program to request a cache block fetch before it 
is actually needed by the program. The program 
can later perform stores to put data into storage. 
However, the processor is not obliged to load the 
addressed block into the data cache. 
Data Cache Block set to Zero X-form


dcbz RA,RB

[Power mnemonic: dclz]

|  31  |  ///  |  RA  |  RB  |  1014  | / |
0      6       11     16     21       31


Let the effective address (EA) be the sum 
(RA|0) + (RB). 

If the block containing the byte addressed by EA is in 
the data cache, all bytes of the block are set to zero. 

If the block containing the byte addressed by EA is 
not in the data cache and the corresponding page is 
Caching Allowed, the block is established in the data 
cache without fetching the block from main storage, 
and all bytes of the block are set to zero. 

If the page containing the byte addressed by EA is 
Caching Inhibited or Write Through Required, then 
either (a) all bytes of the area of main storage that 
corresponds to the addressed block are set to zero, 
or (b) the system alignment error handler is invoked. 

If the block containing the byte addressed by EA is in 
Coherence Required mode, and the block exists in the 
data cache(s) of any other processor(s), it is kept 
coherent in those caches. 

This instruction is treated as a store to the addressed 
byte with respect to address translation and pro¬ 
tection. 

Special Registers Altered: 

None 

- Programming Note - 

If the page containing the byte addressed by EA is 
Caching Inhibited or Write Through Required, the 
system alignment error handler should set to zero 
all bytes of the area of main storage that corre¬ 
sponds to the addressed block. 

See the Interrupt chapter of Part 3, “PowerPC 
Operating Environment Architecture” on page 141 
for discussion of a possible delayed Machine 
Check interrupt that can be caused by dcbz if the 
operating system has set up an incorrect storage 
mapping. 
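The effect of dcbz on the addressed block can be modeled in a few lines. The Python sketch below is a simplification under stated assumptions: storage is a flat bytearray indexed directly by the effective address (no address translation, caching, or protection), the block size defaults to an assumed 32 bytes, and effective addresses are truncated to 32 bits. The function names are hypothetical. The helper also shows the (RA|0) convention, in which an RA field of 0 contributes the value zero rather than the contents of register 0.

```python
def effective_address(gpr, ra, rb, bits=32):
    """EA = (RA|0) + (RB); an RA field of 0 means the value zero."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) % (1 << bits)


def dcbz_model(memory, gpr, ra, rb, block_size=32):
    """Zero every byte of the block containing the addressed byte."""
    ea = effective_address(gpr, ra, rb)
    start = ea - (ea % block_size)  # round down to the block boundary
    memory[start:start + block_size] = bytes(block_size)
    return start
```

Note that the whole block is zeroed, not just the bytes from EA onward, so EA need not be block-aligned.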


Data Cache Block Store X-form 


dcbst RA,RB 


|  31  |  ///  |  RA  |  RB  |  54    | / |
0      6       11     16     21       31


Let the effective address (EA) be the sum 
(RA|0) + (RB). 

If the block containing the byte addressed by EA is in 
Coherence Required mode, and a block containing the 
byte addressed by EA is in the data cache of any 
processor and has been modified, the writing of it to 
main storage is initiated. 

If the block containing the byte addressed by EA is in 
Coherence Not Required mode, and a block containing 
the byte addressed by EA is in the data cache of this 
processor and has been modified, the writing of it to 
main storage is initiated. 

The function of this instruction is independent of the 
Write Through Required/Not Required and Caching 
Inhibited/Allowed modes of the block containing the 
byte addressed by EA. 

It is acceptable to treat this instruction as a load from
the addressed byte with respect to address translation
and protection.

Special Registers Altered: 

None

Data Cache Block Flush X-form


dcbf RA,RB

Let the effective address (EA) be the sum
(RA|0) + (RB). 

The action taken depends on the storage mode associated
with the target and on the state of the block.
The list below describes the action taken for the
various cases. The actions described must be executed
regardless of whether the page containing the
addressed byte is in Caching Inhibited or Caching
Allowed mode.

1. Coherence Required

   Unmodified Block
      Invalidate copies of the block in the caches of
      all processors.

   Modified Block
      Copy the block to storage. Invalidate copies of
      the block in the caches of all processors.

   Absent Block
      If modified copies of the block are in the
      caches of other processors, cause them to be
      copied to storage and invalidated. If unmodified
      copies are in the caches of other processors,
      cause those copies to be invalidated.

2. Coherence Not Required

   Unmodified Block
      Invalidate the block in the processor's cache.

   Modified Block
      Copy the block to storage. Invalidate the block
      in the processor's cache.

   Absent Block
      Do nothing.
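
The case list above can be read as a small decision table. The following Python sketch restates it under stated assumptions: the function name `dcbf_actions`, its parameters, and the action strings are illustrative only and are not part of the architecture.

```python
def dcbf_actions(coherence_required, local_state,
                 remote_modified=False, remote_unmodified=False):
    """Return the list of dcbf actions for one block.

    local_state is "unmodified", "modified", or "absent" and describes the
    block in the cache of the processor executing dcbf. The remote_* flags
    (hypothetical inputs) say whether other processors hold copies; they
    matter only for the Coherence Required / Absent Block case.
    """
    if local_state not in ("unmodified", "modified", "absent"):
        raise ValueError("unknown block state: %r" % (local_state,))

    if coherence_required:
        if local_state == "unmodified":
            return ["invalidate copies in all processors"]
        if local_state == "modified":
            return ["copy block to storage",
                    "invalidate copies in all processors"]
        # Absent Block: act only on copies held by other processors.
        actions = []
        if remote_modified:
            actions.append("copy remote modified copies to storage and invalidate them")
        if remote_unmodified:
            actions.append("invalidate remote unmodified copies")
        return actions

    # Coherence Not Required: only this processor's cache is affected.
    if local_state == "unmodified":
        return ["invalidate block in this processor's cache"]
    if local_state == "modified":
        return ["copy block to storage",
                "invalidate block in this processor's cache"]
    return []  # Absent Block: do nothing
```

In both modes a modified block is copied to storage before it is invalidated, which is what distinguishes a flush from a plain invalidate.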

The function of this instruction is independent of the 
Write Through Required/Not Required and Caching 
Inhibited/Allowed modes of the block containing the 
byte addressed by EA. 

It is acceptable to treat this instruction as a load from
the addressed byte with respect to address translation
and protection.

Special Registers Altered: 

None 


7.3 Enforce In-order Execution 
of I/O Instruction 

Enforce In-order Execution of I/O
X-form


eieio

The eieio instruction provides an ordering function for
the effects of loads and stores executed by a
processor. Executing an eieio instruction ensures that
all applicable loads and stores previously initiated by
the processor are complete with respect to main
storage before any applicable loads and stores
subsequently initiated by the processor access main
storage.

eieio orders loads and stores to storage that is both 
Caching Inhibited and Guarded (see Part 3, “PowerPC 
Operating Environment Architecture” on page 141), 
and stores to storage that is Write Through Required. 
It does not affect the order of other data accesses, 
nor of cache operations (whether caused explicitly by 
execution of a Cache Management instruction, or 
implicitly by the cache coherence mechanism). 

Special Registers Altered: 

None 


— Programming Note - 

The eieio instruction is intended for use in doing
memory-mapped I/O (see Part 3, “PowerPC Operating
Environment Architecture” on page 141)
and in preventing load/store combining operations
in main storage. It can be thought of as placing a
barrier into the stream of storage accesses issued
by a processor, such that any given storage
access appears to be on the same side of the
barrier to both the processor and the I/O device.

The eieio instruction may complete before previously
initiated storage accesses have been performed
with respect to other processors and
mechanisms.


Chapter 8. Time Base


The Time Base (TB) is a 64-bit register (see 
Figure 37) containing a 64-bit unsigned integer which 
is incremented periodically. Each increment adds 1 to 
the low-order bit (bit 63). The frequency at which the 
counter is updated is implementation-dependent. 


|        TBU        |        TBL        |
0                   32                63

Field   Description
TBU     Upper 32 bits of Time Base
TBL     Lower 32 bits of Time Base

Figure 37. Time Base 

The Time Base increments until its value becomes
0xFFFF_FFFF_FFFF_FFFF ($2^{64} - 1$). At the next
increment, its value becomes 0x0000_0000_0000_0000.
There is no explicit indication (such as an interrupt;
see Part 3, “PowerPC Operating Environment
Architecture” on page 141) that this has occurred.


The period of the Time Base depends on the driving
frequency. As an order of magnitude example,
suppose that the CPU clock is 100 MHz and that the
Time Base is driven by this frequency divided by 32.
Then the period of the Time Base would be

$$T_{TB} = \frac{2^{64} \times 32}{100\ \text{MHz}} = 5.90 \times 10^{12}\ \text{seconds}$$


which is approximately 187,000 years. 
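
The arithmetic behind this estimate can be checked with a few lines of Python. This is only a sketch: the function name `time_base_period_seconds` and the choice of a 365.25-day year are assumptions made for the illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length

def time_base_period_seconds(cpu_hz, divisor, bits=64):
    """Seconds until a `bits`-wide counter wraps when it is incremented
    once every `divisor` cycles of a clock running at `cpu_hz`."""
    return (2 ** bits) * divisor / cpu_hz

# The example from the text: 100 MHz CPU clock, Time Base driven at 1/32.
period = time_base_period_seconds(100_000_000, 32)
years = period / SECONDS_PER_YEAR
print("%.3g seconds, about %.0f years" % (period, years))
```

Changing `cpu_hz` or `divisor` shows how strongly the wrap period depends on the driving frequency, which the architecture leaves implementation-dependent.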

The PowerPC Architecture does not specify a relationship
between the frequency at which the Time Base is
updated and other frequencies, such as the CPU clock
or bus clock, in a PowerPC system. The Time Base
update frequency is not required to be constant.
What is required, so that system software can keep
time of day and operate interval timers, is that either:

■ The system provides an (implementation-dependent)
interrupt to software whenever the
update frequency of the Time Base changes, and
also a means to determine what the current
update frequency is; or

■ The update frequency of the Time Base is under
the control of the system software.

- Programming Note - 

If the operating system initializes the Time Base 
on power-on to some reasonable value and the 
update frequency of the Time Base is constant, 
the Time Base can be used as a source of values 
which increase at a constant rate, such as for 
time stamps in trace entries. 

Even if the update frequency is not constant,
values read from the Time Base are
monotonically increasing (except when the Time
Base wraps from $2^{64} - 1$ to 0). If a trace entry is
recorded each time the update frequency
changes, the sequence of Time Base values can
be post-processed to become actual time values.


8.1 Time Base Instructions 


Extended mnemonics 

A pair of extended mnemonics is provided for the
mftb instruction so that it can be coded with the TBR
name as part of the mnemonic rather than as a
numeric operand. See the Assembler Extended Mnemonics
appendix in Part 3, “PowerPC Operating Environment
Architecture” on page 141.


Move From Time Base XFX-form 

mftb RT,TBR 


| 31 | RT | tbr      | 371 | / |
0    6    11         21    31









n ← tbr[5:9] || tbr[0:4]
if n = 268 then
    if (64-bit implementation) then
        RT ← TB
    else
        RT ← TB[32:63]
else if n = 269 then
    if (64-bit implementation) then
        RT ← ³²0 || TB[0:31]
    else
        RT ← TB[0:31]

The TBR field denotes either the Time Base or Time
Base Upper, encoded as shown in Figure 38. The
contents of the designated register are placed into
register RT. When reading Time Base Upper on a
64-bit implementation, the high-order 32 bits of
register RT are set to zero.


decimal   TBR*                 Register   Privileged
          tbr[5:9]  tbr[0:4]   name
268       01000     01100      TB         no
269       01000     01101      TBU        no

* Note that the order of the two 5-bit halves
of the TBR number is reversed.


Figure 38. TBR encodings for mftb 


If the TBR field contains any value other than one of 
the values shown above, the instruction form is 
invalid. 

Special Registers Altered: 

None 

Extended Mnemonics: 

Extended mnemonics for Move From Time Base : 

Extended: Equivalent to: 

mftb Rt mftb Rt,268 

mftbu Rt mftb Rt,269 


— Compiler and Assembler Note - 

The TBR number coded in assembler language 
does not appear directly as a 10-bit binary 
number in the instruction. The number coded is 
split into two 5-bit halves that are reversed in the 
instruction, with the high-order 5 bits appearing in 
bits 16:20 of the instruction and the low-order 5 
bits in bits 11:15. 
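
The field swap described in this note can be made concrete with a short Python sketch. It builds the 32-bit instruction word from the fields in the mftb format (primary opcode 31, extended opcode 371) and shows where the two halves of the TBR number land. The helper names `encode_mftb` and `decode_tbr` are hypothetical, not assembler or architecture names.

```python
def encode_mftb(rt, tbr):
    """Build the instruction word for `mftb RT,TBR`.

    Bits are numbered 0 (most significant) to 31, as in the instruction
    format, so a field ending at bit b is shifted left by 31 - b.
    """
    if tbr not in (268, 269):
        raise ValueError("invalid instruction form: TBR must be 268 or 269")
    if not 0 <= rt <= 31:
        raise ValueError("RT must name one of GPR 0..31")
    high5 = (tbr >> 5) & 0x1F   # high-order half of the TBR number
    low5 = tbr & 0x1F           # low-order half of the TBR number
    return ((31 << 26)          # bits 0:5   primary opcode
            | (rt << 21)        # bits 6:10  RT
            | (low5 << 16)      # bits 11:15 low-order half
            | (high5 << 11)     # bits 16:20 high-order half
            | (371 << 1))       # bits 21:30 extended opcode, bit 31 = 0

def decode_tbr(word):
    """Recover the TBR number from an mftb instruction word."""
    low5 = (word >> 16) & 0x1F
    high5 = (word >> 11) & 0x1F
    return (high5 << 5) | low5

print(hex(encode_mftb(3, 268)))  # word for: mftb 3,268
```

Decoding reverses the swap, so `decode_tbr(encode_mftb(rt, n))` returns `n` for both valid TBR numbers.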


8.2 Reading the Time Base on 
64-bit Implementations 

The contents of the Time Base may be read into a 
GPR by the mftb extended mnemonic. To read the 
contents of the Time Base into register Rx, execute: 

mftb Rx 

Reading the Time Base has no effect on the value it 
contains or the periodic incrementing of that value. 

8.3 Reading the Time Base on 
32-bit Implementations 

On 32-bit implementations, it is not possible to read 
the entire 64-bit Time Base in a single instruction. 
The mftb extended mnemonic moves from the lower 
half of the Time Base (TBL) to a GPR, and the mftbu 
extended mnemonic moves from the upper half (TBU) 
to a GPR. 

Because of the possibility of a carry from TBL to TBU 
occurring between reads of TBL and TBU, a sequence 
such as the following is necessary to read the Time 
Base on 32-bit implementations. 

loop:
    mftbu Rx        # load from TBU
    mftb  Ry        # load from TBL
    mftbu Rz        # load from TBU
    cmpw  Rz,Rx     # see if 'old' = 'new'
    bne   loop      # loop if carry occurred

The comparison and loop are necessary to ensure 
that a consistent pair of values has been obtained. 
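
The reason the loop is needed can be demonstrated with a small simulation. The Python sketch below models a Time Base that advances on every read; the class `FakeTimeBase` and the function `read_time_base` are hypothetical stand-ins for the hardware register and the instruction sequence above.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

class FakeTimeBase:
    """Simulated 64-bit Time Base that advances by `step` on each read."""

    def __init__(self, value, step=1):
        self.value = value & MASK64
        self.step = step

    def _tick(self):
        self.value = (self.value + self.step) & MASK64

    def mftbu(self):
        """Model of mftbu: return the upper 32 bits (TBU)."""
        self._tick()
        return self.value >> 32

    def mftb(self):
        """Model of mftb on a 32-bit implementation: return TBL."""
        self._tick()
        return self.value & MASK32

def read_time_base(tb):
    """Read TBU, TBL, TBU again, and retry if TBU changed in between."""
    while True:
        upper = tb.mftbu()
        lower = tb.mftb()
        if tb.mftbu() == upper:      # no carry from TBL into TBU
            return (upper << 32) | lower

# Start just below a carry out of TBL so the first attempt is inconsistent.
tb = FakeTimeBase(0x0000_0001_FFFF_FFFE)
value = read_time_base(tb)
print(hex(value))
```

In this run the first attempt pairs an upper half of 1 with a lower half of 0, a value the counter never held, so the comparison fails and the sequence repeats.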

- Programming Note -

mftb serves as both a basic and an extended
mnemonic. The assembler will recognize an mftb
mnemonic with two operands as the basic form,
and an mftb mnemonic with one operand as the
extended form. Another way of saying this is that
if mftb is coded with one operand, then that
operand is assumed to be RT, and TBR defaults to
the value corresponding to TB.


8.4 Computing Time of Day
from the Time Base

Since the update frequency of the Time Base is
implementation-dependent, the algorithm for converting
the current value in the Time Base to time of
day is also implementation-dependent.

As an example, assume that the Time Base is incremented
at a constant rate of once for every 32 cycles
of a 100 MHz CPU instruction clock. What is wanted
is the pair of 32-bit values comprising a POSIX
standard clock¹: the number of whole seconds which
have passed since midnight January 1, 1970, and the
remaining fraction of a second expressed as a
number of nanoseconds.

Assume that: 

■ The value 0 in the Time Base represents the start 
time of the POSIX clock (if this is not true, a 
simple 64-bit subtraction will make it so). 

■ Integer constant ticks_per_sec contains the value

  $$\frac{100\ \text{MHz}}{32} = 3{,}125{,}000$$

  which is the number of times the Time Base is
  updated each second.

■ Integer constant ns_adj contains the value

  $$\frac{1{,}000{,}000{,}000}{3{,}125{,}000} = 320$$

  which is the number of nanoseconds per tick of
  the Time Base.

64-bit Implementations 


The POSIX clock can be computed with an instruction 
sequence such as this: 

mftb  Ry                # Ry = Time Base
lwz   Rx,ticks_per_sec
divd  Rz,Ry,Rx          # Rz = whole seconds
stw   Rz,posix_sec
mulld Rz,Rz,Rx          # Rz = quotient * divisor
sub   Rz,Ry,Rz          # Rz = excess ticks
lwz   Rx,ns_adj
mulld Rz,Rz,Rx          # Rz = excess nanoseconds
stw   Rz,posix_ns
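
The same computation can be written out in Python to check the arithmetic. This is a sketch under the assumptions stated in the text (Time Base value 0 at the start of the POSIX clock, 3,125,000 ticks per second); the function name `posix_from_time_base` is hypothetical.

```python
TICKS_PER_SEC = 100_000_000 // 32          # 3,125,000 ticks per second
NS_ADJ = 1_000_000_000 // TICKS_PER_SEC    # 320 ns per tick

def posix_from_time_base(tb):
    """Convert a 64-bit Time Base value to (seconds, nanoseconds)."""
    sec = tb // TICKS_PER_SEC              # divd:  whole seconds
    excess = tb - sec * TICKS_PER_SEC      # mulld, sub: excess ticks
    ns = excess * NS_ADJ                   # mulld: excess nanoseconds
    return sec, ns

print(posix_from_time_base(10 * TICKS_PER_SEC + 1))
```

The nanosecond result is always below one billion because the excess tick count is below `TICKS_PER_SEC` and `TICKS_PER_SEC * NS_ADJ` is exactly 1,000,000,000. Note that the assembler sequence stores only the low-order 32 bits of the seconds value, which this sketch does not model.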

32-bit Implementations 

On a 32-bit machine, direct implementation of the
code given above for 64-bit machines is awkward, due
mainly to the difficulty of doing 64-bit division.² Such
division can be avoided entirely if a time of day clock
in POSIX format is updated at least once each second.

Assume that: 

■ The operating system maintains the following variables:

  - posix_tb (64 bits)

  - posix_sec (32 bits)

  - posix_ns (32 bits)


These variables hold the value of the Time Base 
and the computed POSIX seconds and 
nanoseconds values from the last time the POSIX 
clock was computed. 

■ The operating system arranges for an interrupt 
(see Part 3, “PowerPC Operating Environment 
Architecture” on page 141) to occur at least once 
per second, at which time it recomputes the 
POSIX clock values. 

■ The integer constant billion contains the value 

1,000,000,000. 

The POSIX clock can be computed with an instruction 
sequence such as this: 

loop:
    mftbu Rx                # Rx = TBU
    mftb  Ry                # Ry = TBL
    mftbu Rz                # Rz = 'new' TBU value
    cmpw  Rz,Rx             # see if 'old' = 'new'
    bne   loop              # loop if carry occurred
                            # now have 64-bit TB in Rx and Ry
    lwz   Rz,posix_tb+4
    sub   Rz,Ry,Rz          # Rz = delta in ticks
    lwz   Rw,ns_adj
    mullw Rz,Rz,Rw          # Rz = delta in ns
    lwz   Rw,posix_ns
    add   Rz,Rz,Rw          # Rz = new ns value
    lwz   Rw,billion
    cmpw  Rz,Rw             # see if past 1 sec
    blt   nochange          # branch if not
    sub   Rz,Rz,Rw          # adjust nanoseconds
    lwz   Rw,posix_sec
    addi  Rw,Rw,1           # adjust seconds
    stw   Rw,posix_sec      # store new seconds
nochange:
    stw   Rz,posix_ns       # store new ns
    stw   Rx,posix_tb       # store new time base
    stw   Ry,posix_tb+4

Note that the upper part of the Time Base does not 
participate in the calculation to determine the new 
POSIX time of day. This is correct as long as the 
delta value does not exceed one second. 
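
A Python sketch of this incremental update makes the role of 32-bit wraparound visible. It assumes the example rate of 320 ns per tick and that the update runs at least once per second; the dictionary-based `state` and the function name `update_posix_clock` are hypothetical.

```python
MASK32 = 0xFFFF_FFFF
NS_ADJ = 320                 # nanoseconds per tick in the running example
BILLION = 1_000_000_000

def update_posix_clock(state, tb):
    """Advance state["posix_sec"] and state["posix_ns"] to Time Base `tb`.

    Only the low-order 32 bits (TBL) take part in the subtraction, as in
    the assembler sequence, so the result is valid only while the elapsed
    time since the previous update is no more than one second.
    """
    tbl = tb & MASK32
    old_tbl = state["posix_tb"] & MASK32
    delta_ticks = (tbl - old_tbl) & MASK32             # sub, modulo 2**32
    ns = (delta_ticks * NS_ADJ + state["posix_ns"]) & MASK32  # mullw, add
    if ns >= BILLION:                                  # past one second
        ns -= BILLION                                  # adjust nanoseconds
        state["posix_sec"] = (state["posix_sec"] + 1) & MASK32
    state["posix_ns"] = ns
    state["posix_tb"] = tb                             # store new time base
    return state

# A carry out of TBL between updates does not disturb the delta.
state = {"posix_tb": 0x0000_0001_FFFF_FF00,
         "posix_sec": 5, "posix_ns": 999_999_000}
update_posix_clock(state, 0x0000_0002_0000_0000)
print(state["posix_sec"], state["posix_ns"])
```

Because the subtraction is taken modulo 2^32, the 256-tick delta is computed correctly even though TBL wrapped to zero, which is why the upper half of the Time Base can be left out of the calculation.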

¹ Described in POSIX Draft Standard P1003.4/D12, Draft Standard for Information Technology - Portable Operating System Interface (POSIX) -
Part 1: System Application Program Interface (API) - Amendment 1: Realtime Extension [C Language]. Institute of Electrical and Electronics
Engineers, Inc., Feb. 1992.

² See D. E. Knuth, The Art of Computer Programming, Volume 2, Seminumerical Algorithms, section 4.3.1, Algorithm D. Addison-Wesley, 1981.


Non-constant update frequency

In a system in which the update frequency of the Time
Base may change over time, it is not possible to
convert an isolated Time Base value into time of day.
Instead, a Time Base value has meaning only with
respect to the current update frequency and the time
of day that the update frequency was last changed.
Each time the update frequency changes, either the
system software is notified of the change via an interrupt
(see Part 3, “PowerPC Operating Environment
Architecture” on page 141), or else the change was
instigated by the system software itself. At each such
change, the system software must compute the
current time of day using the old update frequency,
compute a new value of ticks_per_sec for the new
frequency, and save the time of day, Time Base
value, and tick rate. Subsequent calls to compute
time of day use the current Time Base value and the
saved data.
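
The bookkeeping described in this paragraph can be sketched in Python as follows. The `Epoch` record and the functions `time_of_day` and `start_new_epoch` are hypothetical names; the sketch assumes the saved data consist of the time of day, the Time Base value, and the tick rate at the most recent frequency change.

```python
from typing import NamedTuple

BILLION = 1_000_000_000
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

class Epoch(NamedTuple):
    """Data saved at the most recent change of update frequency."""
    sec: int            # time of day at the change: whole seconds
    ns: int             # time of day at the change: nanoseconds
    tb: int             # Time Base value at the change
    ticks_per_sec: int  # update frequency in effect since the change

def time_of_day(epoch, tb):
    """Convert Time Base value `tb` to (seconds, nanoseconds)."""
    delta = (tb - epoch.tb) & MASK64            # ticks since the change
    sec, rem = divmod(delta, epoch.ticks_per_sec)
    ns = epoch.ns + rem * BILLION // epoch.ticks_per_sec
    sec += epoch.sec
    if ns >= BILLION:                           # carry into seconds
        sec += 1
        ns -= BILLION
    return sec, ns

def start_new_epoch(epoch, tb_now, new_ticks_per_sec):
    """Close the current epoch at `tb_now` using the old frequency and
    save the time of day, Time Base value, and new tick rate."""
    sec, ns = time_of_day(epoch, tb_now)
    return Epoch(sec, ns, tb_now, new_ticks_per_sec)

e0 = Epoch(sec=0, ns=0, tb=0, ticks_per_sec=3_125_000)
e1 = start_new_epoch(e0, 6_250_000, 1_000_000)   # frequency changes at 2 s
print(time_of_day(e1, 6_250_000 + 1_500_000))
```

A system with a constant update frequency would create a single `Epoch` at initialization and never replace it.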


- Programming Note - 

A generalized service to compute time of day 
could take the following as input. 

1. Time of day at beginning of current epoch 

2. Time Base value at beginning of current 
epoch 

3. Time Base update frequency 

4. Time Base value for which time of day is 
desired 

For a PowerPC system in which the Time Base
update frequency does not vary, the first three
inputs would be constant.

Part 3. PowerPC Operating Environment Architecture


This part defines the additional instructions and facilities,
beyond those of the PowerPC User Instruction
Set Architecture and PowerPC Virtual Environment
Architecture. It covers instructions and facilities not
available to the application programmer, affecting
storage control, interrupts, and timing facilities.


Chapter 9. Introduction


9.1 Overview 

Part 1, “PowerPC User Instruction Set Architecture”
on page 1 describes computation modes, compatibility
with the Power Architecture, document conventions,
a general systems overview, instruction
formats, and storage addressing. This chapter augments
that description as necessary for the PowerPC
Operating Environment Architecture.

9.2 Compatibility with the Power 
Architecture 

The PowerPC Architecture provides binary compatibility
for Power application programs, except as
described in the “Incompatibilities with the Power
Architecture” appendix of Part 1, “PowerPC User
Instruction Set Architecture” on page 1. Binary compatibility
is not necessarily provided for privileged
Power instructions.


9.3 Document Conventions 

The notation and terminology used in Book I apply
to this document also, with the following substitutions.

■ For “system alignment error handler” substitute
“Alignment interrupt.”

■ For “system data storage error handler” substitute
“Data Storage interrupt.”

■ For “system error handler” substitute “interrupt.”

■ For “system floating-point assist error handler”
substitute “Floating-Point Assist interrupt.”

■ For “system floating-point enabled exception
error handler” substitute “Floating-Point Enabled
Exception type Program interrupt.”

■ For “system floating-point unavailable error
handler” substitute “Floating-Point Unavailable
interrupt.”

■ For “system illegal instruction error handler” substitute
“Illegal Instruction type Program
interrupt.”

■ For “system instruction storage error handler”
substitute “Instruction Storage interrupt.”

■ For “system privileged instruction error handler”
substitute “Privileged Instruction type Program
interrupt.”

■ For “system service program” substitute “System
Call interrupt.”

■ For “system trap handler” substitute “Trap type
Program interrupt.”

9.3.1 Definitions and Notation 

The following augments the definitions given in Book I.

■ The context of a program is defined by the
content of the MSR when the program is executing.
It defines the manner in which the
program accesses and executes instructions,
accesses data, controls interrupts, accesses the
floating-point unit, and interprets addresses or
fixed-point data (32 bits or 64 bits).

■ An exception is an error, unusual condition, or 
external signal, that may set a status bit, and 
which may or may not cause an interrupt, 
depending upon whether or not the corresponding 
interrupt is enabled. 

■ An interrupt is the act of changing the machine 
state in response to an exception, as described in 
Chapter 13, “Interrupts” on page 193. 

■ A trap interrupt is an interrupt that results from 
execution of a Trap instruction. 

■ Hardware means any combination of hard-wired
implementation, “fast trap” to implementation-dependent
software assistance, or interrupt for
software assistance. In the last case, the interrupt
may be to an architected location or to an
implementation-dependent location. Any use of
fast traps or interrupts to implement the architecture
is described in Book IV, PowerPC Implementation
Features.

■ /, //, ///, ... denotes a field that is reserved in an
instruction, a register, or in an architected
storage table.

9.3.2 Reserved Fields 

System software should initialize reserved fields in
architected storage tables (Segment Table, Page
Table) to 0s and not keep data in them, as the fields
may be used in the future by subsequent versions of
PowerPC Architecture.

Some fields of certain storage tables may be written
to automatically by hardware, e.g. Reference and
Change bits in the Page Table. When the hardware
writes to such a table, the following rules must be
followed:

■ No defined field other than the one(s) the hardware
is specifically updating may be modified.

■ Contents of reserved fields may be preserved by
hardware or such fields may be written as 0s. No
other changes to reserved fields may be made.

The handling of reserved bits in status and control 
registers described in Book I applies here as well. In 
addition, the reader should be cognizant that reading 
and writing of some of these registers (e.g., the MSR) 
can occur as a side effect of processing an interrupt 
and of returning from an interrupt, as well as when 
requested explicitly by the appropriate instruction 
(e.g., mtmsr). 

9.3.3 Description of Instruction
Operation

The following augments the definitions given in Book I
in the description of the RTL.

Notation Meaning 

SEGREG(x) Segment Register x 


9.4 General Systems Overview 

The processor or processor unit contains the 
sequencing and processing controls for instruction 
fetch, instruction execution and interrupt action. 
Instructions that the processing unit can execute fall 
into a number of classes: 

■ instructions executed in the Branch Processor 

■ instructions executed in the Fixed-Point Processor 


■ instructions executed in the Floating-Point 
Processor 

Almost all instructions executed in the Branch
Processor, Fixed-Point Processor, and Floating-Point
Processor are non-privileged and are described in
Part 1, “PowerPC User Instruction Set Architecture”
on page 1. Part 2, “PowerPC Virtual Environment
Architecture” on page 117 contains some cache management
instructions. Instructions related to the privileged
state of the processor, control of processor
resources, control of the storage hierarchy, and all
other privileged instructions are described here or in
Book IV, PowerPC Implementation Features.


9.5 Instruction Formats 

See Part 1, “PowerPC User Instruction Set 
Architecture” on page 1 for a description of the 
instruction formats and addressing. 

9.5.1 Instruction Fields 

The following augments the instruction fields 
described in Book I. 

SPR (11:20) 

Special Purpose Register 

See the descriptions of the mtspr (page 79) and 
mfspr (page 80) instructions for a list of SPR 
encodings. 

SR (12:15) 

Field used to specify one of the 16 Segment Registers.


9.6 Exceptions 

The following augments the list, given in Book I, of 
exceptions that can be caused by the execution of an 
instruction. 

■ the execution of a Load or Store instruction to a 
direct-store segment, in a manner that causes an 
exception (direct-store error exception) 

■ the execution of a traced instruction (Trace 
exception) 


9.7 Synchronization 

The synchronization described in this section refers to
the state of the processor that is performing the
synchronization.


Figure 39. Logical View of the PowerPC Processor
Architecture

9.7.1 Context Synchronization 

An instruction or event is “context synchronizing” if it 
satisfies the requirements listed below. Such 
instructions and events are collectively called 
“context synchronizing operations.” Examples of 
context synchronizing operations include the sc 
instruction (see Part 1, “PowerPC User Instruction Set 
Architecture” on page 1), the rfi instruction, and most 
interrupts. 

1. The operation causes instruction dispatching (the
issuance of instructions by the instruction fetch
mechanism to any instruction execution mechanism)
to be halted.

2. The operation is not initiated or, in the case of 
isync, is not completed, until all instructions 
already in execution have completed to a point at 
which they have reported all exceptions they will 
cause. (If a storage access due to a previously 
initiated instruction may cause one or more 
Direct-Store Error exceptions, the determination 
of whether it does cause such exceptions is made 
before the operation is initiated.) 


3. If the operation directly causes an interrupt (e.g., 
sc directly causes a System Call interrupt) or is 
an interrupt, the operation is not initiated until no 
exception exists having higher priority than the 
exception associated with the interrupt (see 
Section 13.8, “Interrupt Priorities” on page 203). 

4. The instructions that precede the operation will
complete execution in the context (privilege, relocation,
storage protection, etc.) in which they
were initiated.

5. The instructions that follow the operation will be
fetched and executed in the context established
by the operation. (This requires that any prefetched
instructions be discarded, which in turn
requires that any effects and side effects of speculatively
executing them also be discarded. The
only side effects of these instructions that are
permitted to survive are those specified in
Section 12.2.5, “Speculative Execution” on
page 159.)

A context synchronizing operation is necessarily execution
synchronizing; see Section 9.7.2, “Execution
Synchronization.” Unlike the sync instruction (see
Part 2, “PowerPC Virtual Environment Architecture” 
on page 117), a context synchronizing operation need 
not wait for storage-related operations to complete on 
other processors, nor for Reference and Change bits 
in the Page Table (see Chapter 12, “Storage Control” 
on page 157) to be updated. 

9.7.2 Execution Synchronization 

An instruction is “execution synchronizing” if all previously
initiated instructions appear to have completed
before the instruction is initiated or, in the case
of sync and isync, before the instruction completes.
Examples of execution synchronizing instructions are 
sync (see Part 1, “PowerPC User Instruction Set 
Architecture” on page 1) and mtmsr. Also, all 
context synchronizing instructions (see Section 9.7.1) 
are execution synchronizing. 

Unlike a context synchronizing operation, an execution
synchronizing instruction need not ensure that
the instructions following that instruction will execute
in the context established by that instruction. This
new context becomes effective sometime after the
execution synchronizing instruction completes and
before or at a subsequent context synchronizing operation.


Chapter 10. Branch Processor


10.1 Branch Processor Overview 

This chapter describes the details concerning the registers
and the privileged instructions implemented in
the Branch Processor that are in addition to those
shown in Part 1, “PowerPC User Instruction Set
Architecture” on page 1.


— Programming Note - 

In some implementations, every instruction fetch
when MSR[IR] = 1, and every instruction execution
requiring address translation when MSR[DR] = 1,
may have the side effect of modifying SRR0. For
further details see the Book IV, PowerPC Implementation
Features document for the implementation.


10.2 Branch Processor Registers 

10.2.1 Machine Status Save/Restore
Register 0

The Machine Status Save/Restore Register 0 (SRR0)
is a 32-bit or 64-bit register depending on the version
of the architecture implemented. This register is used
to save machine status on interrupts, and to restore
machine status when a Return From Interrupt (rfi)
instruction is executed.

|            SRR0            | // |
0                          61   63
{0}                       {29} {31}

Figure 40. Save/Restore Register 0

In general, SRR0 contains the instruction address that
caused the interrupt, or the instruction address to
return to after an interrupt is serviced.

On interrupt, SRR0 is set to the current or next
instruction address. Thus if the interrupt occurs in
32-bit mode, the high-order 32 bits of SRR0 are set to
0. When rfi is executed, the contents of SRR0 are
copied to the current instruction address (CIA), except
that the high-order 32 bits of the CIA are set to 0
when returning to 32-bit mode.


10.2.2 Machine Status Save/Restore
Register 1

The Machine Status Save/Restore Register 1 (SRR1)
is a 32-bit or 64-bit register depending on the version
of the architecture implemented. This register is used
to save machine status on interrupts, and to restore
machine status when an rfi instruction is executed.

|               SRR1               |
0                            63 {31}

Figure 41. Save/Restore Register 1

In general, when an interrupt occurs, bits 33:36 and
42:47 {1:4 and 10:15} of SRR1 are loaded with information
specific to the interrupt type, and bits 0:32,
37:41, and 48:63 {0, 5:9, and 16:31} of the MSR are
placed into the corresponding bit positions of SRR1.


- Programming Note -

In some implementations, every instruction fetch
when MSR[IR] = 1, and every instruction execution
requiring address translation when MSR[DR] = 1,
may have the side effect of modifying SRR1. For
further details see the Book IV, PowerPC Implementation
Features document for the implementation.







10.2.3 Machine State Register 

The Machine State Register (MSR) is a 32-bit or 64-bit 
register depending on the version of the architecture 
implemented. This register defines the state of the 
processor. On interrupt, the MSR bits are altered in 
accordance with Figure 68 on page 195. The MSR 
can also be modified by the mtmsr, sc, and rfi 
instructions. It can be read by the mfmsr instruction. 


MSR 

0 63 {31} 

Figure 42. Machine State Register 

Below are shown the bit definitions for the Machine 
State Register. The notation “full function” on a 
reserved bit means that it is saved in SRR1 when an 
interrupt occurs. The notation “partial function” 
means that it is not saved. 

Bit(s) Description 

0 Sixty-Four-bit mode (SF) 

0 the processor runs in 32-bit mode. 

1 the processor runs in 64-bit mode. 

1:32 {0} Reserved full function 

33:36 {1:4} Reserved partial function 

37:41 {5:9} Reserved full function 

42:44 {10:12} Reserved partial function 

45 {13} Power Management Enable (POW) 

0 power management disabled (normal 
operation mode). 

1 power management enabled (reduced 
power mode). 

Power management functions are 
implementation-dependent. For further 
descriptions of the effect of this bit, see the 
Book IV, PowerPC Implementation Features 
document for the implementation. 

46 {14} Implementation-Dependent Function 

See the Book IV, PowerPC Implementation 
Features document for the implementation. 

47 {15} Interrupt Little-Endian Mode (ILE) 

When an interrupt is taken, this bit is copied
into MSR[LE] to select the Endian mode for the
context established by the interrupt.

48 {16} External Interrupt Enable (EE) 

0 the processor is disabled against External 
and Decrementer interrupts. 

1 the processor is enabled to take an 
External or Decrementer interrupt. 

49 {17} Problem State (PR) 

0 the processor is privileged to execute any 
instruction 


1 the processor can only execute the non- 
privileged instructions. 

MSR[PR] also affects storage protection, as
described in Chapter 12, “Storage Control”
on page 157.

50 {18} Floating-Point Available (FP)

0 the processor cannot execute any floating-point
instructions, including floating-point
loads, stores and moves.

1 the processor can execute floating-point
instructions.

51 {19} Machine Check Enable (ME) 

0 Machine Check interrupts are disabled. 

1 Machine Check interrupts are enabled. 

52 {20} Floating-Point Exception Mode 0 (FE0) 

See below. 

53 {21} Single-Step Trace Enable (SE) 

0 the processor executes instructions 
normally. 

1 the processor generates a Single-Step
type Trace interrupt upon the successful
execution of the next instruction. Successful
execution means the instruction
caused no other interrupt. See Book IV,
PowerPC Implementation Features.

Single-step tracing may not be present on all
implementations. If the function is not implemented,
MSR[SE] should be treated as a
reserved MSR bit: mfmsr may return the last
value written to the bit, or may return 0
always.

54 {22} Branch Trace Enable (BE) 

0 the processor executes branch
instructions normally.

1 the processor generates a Branch type
Trace interrupt after completing the execution
of a branch instruction, whether or
not the branch is taken. See Book IV,
PowerPC Implementation Features.

Branch tracing may not be present on all
implementations. If the function is not implemented,
MSR[BE] should be treated as a
reserved MSR bit: mfmsr may return the last
value written to the bit, or may return 0
always.

55 {23} Floating-Point Exception Mode 1 (FE1) 

See below. 

56 {24} Reserved full function 

This bit corresponds to the AL bit of the 
Power Architecture. It will not be assigned 
new meaning in the near future. As for any 
other reserved bit in a register, software is 
permitted to write the value 1 to this bit, but 
there is no guarantee that a subsequent 
reading of this bit will yield the value that
software “wrote” there. 

- Programming Note - 

Power-compatible operating systems will 
probably write the value 1 to this bit. 


57 {25} Interrupt Prefix (IP) 

In the following description, nnnnn is the 
offset of the interrupt. See Figure 69 on 
page 195. 

0 interrupts vectored to the real address
0x000n_nnnn in 32-bit versions and real
address 0x0000_0000_000n_nnnn in 64-bit
versions.

1 interrupts vectored to the real address
0xFFFn_nnnn in 32-bit versions and real
address 0xFFFF_FFFF_FFFn_nnnn in 64-bit
versions.
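The effect of the IP bit on the vector address can be sketched as follows. This is only an illustration of the address patterns listed above; the function name and its parameters are hypothetical, and the offset 0xC00 used in the examples is the System Call offset mentioned later in this part.

```python
def interrupt_vector(ip: int, offset: int, bits: int = 32) -> int:
    """Real address of an interrupt vector for a given MSR[IP] setting.

    `offset` is the interrupt offset (nnnnn, at most 20 bits);
    `bits` is 32 or 64, the width of the implementation.
    """
    if ip not in (0, 1):
        raise ValueError("IP is a single bit")
    if not 0 <= offset <= 0xFFFFF:
        raise ValueError("offset must fit in the low-order 20 bits")
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    # IP=0 selects a base of zero; IP=1 selects all ones above the offset.
    base = 0 if ip == 0 else ((1 << bits) - 1) & ~0xFFFFF
    return base | offset


print(hex(interrupt_vector(0, 0xC00)))      # 0xc00
print(hex(interrupt_vector(1, 0xC00)))      # 0xfff00c00
print(hex(interrupt_vector(1, 0xC00, 64)))  # 0xfffffffffff00c00
```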

58 {26} Instruction Relocate (IR) 

0 instruction address translation is off. 

1 instruction address translation is on. 

59 {27} Data Relocate (DR) 

0 data address translation is off. 

1 data address translation is on. 

60 {28} Reserved full function 

61 {29} Reserved full function 

62 {30} Recoverable Interrupt (RI)

0 interrupt is not recoverable. 

1 interrupt is recoverable. 

Additional information about the use of this 
bit is given in Sections 13.4, “Interrupt 
Processing” on page 194, 13.5.1, “System 
Reset Interrupt” on page 196, and 13.5.2, 
“Machine Check Interrupt” on page 196. 

63 {31} Little-Endian Mode (LE) 

0 the processor runs in Big-Endian mode. 

1 the processor runs in Little-Endian mode. 

The Floating-Point Exception Mode bits are inter¬ 
preted as shown below. For further details see 
Part 1, “PowerPC User Instruction Set Architecture” 
on page 1. 

FE0  FE1  Mode

0    0    Interrupts disabled
0    1    Imprecise Nonrecoverable
1    0    Imprecise Recoverable
1    1    Precise
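As an illustration of the table above, the following sketch decodes the Floating-Point Exception Mode from an MSR value. The helper names are hypothetical. It relies on the bit numbering used in this book, in which bit 0 is the most significant bit, so FE0 (bit 52 {20}) and FE1 (bit 55 {23}) fall at the same distance from the low-order end in both 64-bit and 32-bit implementations.

```python
FP_EXCEPTION_MODES = {
    (0, 0): "Interrupts disabled",
    (0, 1): "Imprecise Nonrecoverable",
    (1, 0): "Imprecise Recoverable",
    (1, 1): "Precise",
}


def msr_bit(msr: int, bit64: int) -> int:
    """Extract one MSR bit given its 64-bit bit number (bit 0 = MSB)."""
    return (msr >> (63 - bit64)) & 1


def fp_exception_mode(msr: int) -> str:
    fe0 = msr_bit(msr, 52)  # bit 52 {20}
    fe1 = msr_bit(msr, 55)  # bit 55 {23}
    return FP_EXCEPTION_MODES[(fe0, fe1)]


print(fp_exception_mode(0x0900))  # Precise
print(fp_exception_mode(0x0800))  # Imprecise Recoverable
```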


10.2.4 Processor Version Register 

The Processor Version Register is a 32-bit read-only 
register that contains a value identifying the specific 
version (model) and revision level of the PowerPC 
processor. The contents of the PVR can be copied to 
a GPR by the mfspr instruction. Read access to the 
PVR is privileged; write access is not provided. 


|     Version     |     Revision     |
0                 16               31


Figure 43. Processor Version Register 


The PVR contains two fields: 

Version A 16-bit number that uniquely determines 
a particular processor version and 
version of the PowerPC Architecture. 
This number can be used to determine 
the version of a processor; it may not dis¬ 
tinguish between different product models 
if more than one model uses the same 
processor. 

Revision A 16-bit number that distinguishes 
between various releases of a particular 
version, i.e. an Engineering Change level. 

The value of the Version portion of the PVR is 
assigned by the PowerPC Architecture process. 
Values assigned to date are listed in — Heading 'PVN' 
unknown —. 


The value of the Revision portion of the PVR is imple¬ 
mentation defined. 
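A minimal sketch of how software might separate the two PVR fields after copying the register to a GPR. The function name is hypothetical, and the sample value is made up for illustration; it does not correspond to any assigned Version number.

```python
def split_pvr(pvr: int) -> tuple[int, int]:
    """Split a 32-bit PVR value into (version, revision)."""
    if not 0 <= pvr <= 0xFFFFFFFF:
        raise ValueError("the PVR is a 32-bit register")
    version = pvr >> 16      # bits 0:15, the high-order halfword
    revision = pvr & 0xFFFF  # bits 16:31, the low-order halfword
    return version, revision


version, revision = split_pvr(0x00010203)  # made-up value
print(hex(version), hex(revision))         # 0x1 0x203
```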
10.3 Branch Processor Instructions


10.3.1 System Linkage Instructions 

These instructions provide the means by which a 
program can call upon the system to perform a 
service, and by which the system can return from per¬ 
forming a service or from processing an interrupt. 

These instructions are context synchronizing, as 
defined in Section 9.7.1, “Context Synchronization” on 
page 145. 

The System Call instruction is described in Part 1, 
“PowerPC User Instruction Set Architecture” on 
page 1, but only at the level required by an applica¬ 
tion programmer. A complete description of this 
instruction appears below. 


System Call SC-form

sc

[Power mnemonic: svca]

|  17  |  ///  |  ///  |  ///  |  1  |  /  |
0      6       11      16      30    31

SRR0 <-iea CIA + 4
SRR1[33:36 42:47 {1:4 10:15}] <- undefined
SRR1[0:32 37:41 48:63 {0 5:9 16:31}] <- MSR[0:32 37:41 48:63 {0 5:9 16:31}]
MSR <- new_value (see below)
NIA <-iea base_ea + 0xC00 (see below)

The effective address of the instruction following the 
System Call instruction is placed into SRR0. Bits 0:32, 
37:41, and 48:63 {0, 5:9, and 16:31} of the MSR are 
placed into the corresponding bits of SRR1, and bits 
33:36 and 42:47 {1:4 and 10:15} of SRR1 are set to 
undefined values. 

Then a System Call interrupt is generated. The inter¬ 
rupt causes the MSR to be altered as described in 
Section 13.5, “Interrupt Definitions” on page 195. 

The interrupt causes the next instruction to be fetched
from offset 0xC00 from the base real address
indicated by the new setting of MSR[IP].

This instruction is context synchronizing. 

Special Registers Altered: 

SRR0 SRR1 MSR
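The register saving performed by System Call can be sketched as below. The helper names are hypothetical, and modelling the undefined SRR1 bits as zero is purely an assumption made for the sketch; the architecture leaves them undefined. Bit ranges use this book's numbering, with bit 0 as the most significant bit.

```python
def ibm_mask(ranges, width):
    """Build a mask from inclusive bit ranges, where bit 0 is the MSB."""
    mask = 0
    for first, last in ranges:
        for bit in range(first, last + 1):
            mask |= 1 << (width - 1 - bit)
    return mask


# MSR bits copied to SRR1: 0:32, 37:41, 48:63 {0, 5:9, 16:31}
SAVED_64 = ibm_mask([(0, 32), (37, 41), (48, 63)], 64)
SAVED_32 = ibm_mask([(0, 0), (5, 9), (16, 31)], 32)


def system_call_save(cia, msr, width=64):
    """Return the (SRR0, SRR1) values set by sc."""
    saved = SAVED_64 if width == 64 else SAVED_32
    srr0 = (cia + 4) & ((1 << width) - 1)  # address of the next instruction
    srr1 = msr & saved                     # undefined bits modelled as 0
    return srr0, srr1


print(hex(SAVED_64))  # 0xffffffff87c0ffff
print(hex(SAVED_32))  # 0x87c0ffff
```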


— Compatibility Note - 

For a discussion of Power compatibility with 
respect to instruction bits 16:29, please refer to 
the “Incompatibilities with the Power 
Architecture” appendix of Part 1, “PowerPC User 
Instruction Set Architecture” on page 1. For com¬ 
patibility with future versions of this architecture, 
these bits should be coded as zero. 


Return From Interrupt XL-form

rfi

|  19  |  ///  |  ///  |  ///  |  50  |  /  |
0      6       11      16      21     31

MSR[0:32 37:41 48:63 {0 5:9 16:31}] <- SRR1[0:32 37:41 48:63 {0 5:9 16:31}]
NIA <-iea SRR0[0:61 {0:29}] || 0b00

Bits 0:32, 37:41, and 48:63 {0, 5:9, and 16:31} of SRR1
are placed into the corresponding bits of the MSR.
Then the next instruction is fetched, under control of
the new MSR value, from the address
SRR0[0:61 {0:29}] || 0b00 (32-bit implementations, and
64-bit implementations when SF=1 in the new MSR
value) or (32)0 || SRR0[32:61] || 0b00 (64-bit
implementations when SF=0 in the new MSR value).
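The three address cases described above can be sketched as follows. The function and parameter names are hypothetical; `sf` stands for the SF bit of the new MSR value and is ignored for 32-bit implementations.

```python
def rfi_next_address(srr0: int, width: int = 64, sf: int = 1) -> int:
    """Address of the next instruction fetched after rfi."""
    word_aligned = srr0 & ~0b11  # drop the two low-order bits, append 0b00
    if width == 32:
        return word_aligned & 0xFFFFFFFF          # SRR0[0:29] || 0b00
    if sf == 1:
        return word_aligned & 0xFFFFFFFFFFFFFFFF  # SRR0[0:61] || 0b00
    return word_aligned & 0xFFFFFFFF              # (32)0 || SRR0[32:61] || 0b00


print(hex(rfi_next_address(0x123456789ABCDEF3, 64, 1)))  # 0x123456789abcdef0
print(hex(rfi_next_address(0x123456789ABCDEF3, 64, 0)))  # 0x9abcdef0
```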

If this instruction enables any pending exceptions, the 
interrupt associated with the highest priority pending 
exception is generated. 


This instruction is privileged and context synchro¬ 
nizing. 

Special Registers Altered:

MSR


Chapter 11. Fixed-Point Processor


11.1 Fixed-Point Processor 
Overview 

This chapter describes the details concerning the reg¬ 
isters and the privileged instructions implemented in 
the Fixed-Point Processor that are in addition to those 
shown in Part 1, “PowerPC User Instruction Set 
Architecture” on page 1. 

11.2 PowerPC Special Purpose 
Registers 

The Special Purpose Registers are read and written 
via the mfspr (page 79) and mtspr (page 79) 
instructions. The descriptions of these instructions 
list the valid encodings of SPR numbers. Encodings 
not listed are reserved for future use or for use as 
implementation-specific registers. 

Most SPRs are defined in other parts of this book; see 
the index to locate those definitions. Some SPRs are 
specific to an implementation. See Appendix M, 
“Implementation-Specific SPRs” on page 273 and 
Book IV, PowerPC Implementation Features. 

11.3 Fixed-Point Processor 
Registers 

11.3.1 Data Address Register 

The Data Address Register (DAR) is a 32-bit or 64-bit 
register depending on the version of the architecture 
implemented. See Sections 13.5.3, “Data Storage 
Interrupt” on page 194, and 13.5.6, “Alignment 
Interrupt” on page 196. 

When an interrupt that uses the DAR occurs, the DAR
is set to the effective address associated with the
interrupting instruction. If the interrupt occurs in
32-bit mode, the high-order 32 bits of the DAR are set
to 0.


DAR 

0 63 {31} 

Figure 44. Data Address Register 

11.3.2 Data Storage Interrupt Status 
Register 

The Data Storage Interrupt Status Register (DSISR) is 
a 32-bit register that defines the cause of Data 
Storage and Alignment interrupts. See Sections 
13.5.3, “Data Storage Interrupt” on page 194 and 
13.5.6, “Alignment Interrupt” on page 196. 


DSISR 

0 31 

Figure 45. Data Storage Interrupt Status Register 

11.3.3 Software-use SPRs 


SPRG0 through SPRG3 are 64-bit {32-bit} registers
provided for operating system use.



0 63 {31} 


Figure 46. Software-use SPRs 

The following list describes the conventional uses of
SPRG0 through SPRG3.

SPRG0

Software may load a unique real address in this
register to identify an area of storage reserved for
use by the first level interrupt handler. This area
must be unique for each processor in the system.

SPRG1

This register may be used as a scratch register by
the first level interrupt handler to save the content
of a GPR. That GPR then can be loaded from
SPRG0 and used as a base register to save other
GPRs to storage.

SPRG2 

This register may be used by the operating system 
as needed. 

SPRG3 

This register may be used by the operating system 
as needed. 

11.4 Fixed-Point Processor 
Privileged Instructions 

11.4.1 Move To/From System 
Registers Instructions 

The Move To Special Purpose Register and Move 
From Special Purpose Register instructions are 
described in Part 1, “PowerPC User Instruction Set 
Architecture” on page 1, but only at the level avail¬ 
able to an application programmer. In particular, no 
mention is made there of registers that can be 
accessed only in privileged state. A complete 
description of these instructions appears below. 

Extended mnemonics 

A set of extended mnemonics is provided for the 
mtspr and mfspr instructions so that they can be 
coded with the SPR name as part of the mnemonic 
rather than as a numeric operand. See Appendix C, 
“Assembler Extended Mnemonics” on page 221. 


Move To Special Purpose Register XFX-form

mtspr SPR,RS

|  31  |  RS  |  spr  |  467  |  /  |
0      6      11      21      31

n <- spr[5:9] || spr[0:4]
if length(SPREG(n)) = 64 then
    SPREG(n) <- (RS)
else
    SPREG(n) <- (RS)[32:63 {0:31}]

The SPR field denotes a Special Purpose Register, 
encoded as shown in Figure 47 on page 153. The 
contents of register RS are placed into the designated 
Special Purpose Register. For Special Purpose Regis¬ 
ters that are 32 bits long, the low-order 32 bits of RS 
are placed into the SPR. 

For this instruction, SPRs TBL and TBU are treated as 
separate 32-bit registers; setting one leaves the other 
unaltered. 

spr[0] = 1 if and only if writing the register is privileged.
Execution of this instruction specifying a defined and
privileged register when MSR[PR] = 1 will result in a
Privileged Instruction type Program interrupt.

Additional values of the SPR field, beyond those 
shown in Figure 47 on page 153, may be defined in 
Book IV, PowerPC Implementation Features for the 
implementation (see also Appendix M, 
“Implementation-Specific SPRs” on page 273). If the 
SPR field contains any value other than one of these 
implementation-specific values or one of the values 
shown in the Figure, the instruction form is invalid. 
However, if MSR[PR] = 1 then the only effect of
executing an invalid instruction form in which spr[0] = 1 is
to cause either a Privileged Instruction type Program
interrupt or an Illegal Instruction type Program
interrupt.

Special Registers Altered: 

See Figure 47 on page 153 


— Compiler and Assembler Note - 

For the mtspr and mfspr instructions, the SPR 
number coded in assembler language does not 
appear directly as a 10-bit binary number in the 
instruction. The number coded is split into two 
5-bit halves that are reversed in the instruction, 
with the high-order 5 bits appearing in bits 16:20 
of the instruction and the low-order 5 bits in bits
11:15. This maintains compatibility with Power 
SPR encodings, in which these two instructions 
had only a 5-bit SPR field occupying bits 11:15. 
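The swap of the two 5-bit halves can be illustrated with a small encoder. This is a sketch, not an assembler; the function names are hypothetical. The primary opcode (31) and the extended opcodes (467 for mtspr, 339 for mfspr) are those shown in the instruction layouts, and instruction bit 0 is the most significant bit of the 32-bit word.

```python
def spr_field(spr_number: int) -> int:
    """10-bit spr field (instruction bits 11:20) for an SPR number."""
    if not 0 <= spr_number < 1024:
        raise ValueError("SPR numbers are 10 bits")
    high, low = spr_number >> 5, spr_number & 0x1F
    # The low-order half goes in bits 11:15, the high-order half in 16:20.
    return (low << 5) | high


def encode_mtspr(spr_number: int, rs: int) -> int:
    if not 0 <= rs < 32:
        raise ValueError("RS must name one of 32 GPRs")
    return (31 << 26) | (rs << 21) | (spr_field(spr_number) << 11) | (467 << 1)


def encode_mfspr(rt: int, spr_number: int) -> int:
    if not 0 <= rt < 32:
        raise ValueError("RT must name one of 32 GPRs")
    return (31 << 26) | (rt << 21) | (spr_field(spr_number) << 11) | (339 << 1)


print(hex(encode_mtspr(8, 0)))    # mtspr 8,r0 (LR):      0x7c0803a6
print(hex(encode_mfspr(0, 8)))    # mfspr r0,8 (LR):      0x7c0802a6
print(hex(encode_mtspr(272, 3)))  # mtspr 272,r3 (SPRG0): 0x7c7043a6
```

For SPR numbers below 32 the high-order half is zero, so the encoding matches a 5-bit field in bits 11:15, which is the compatibility property the note describes.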
            SPR¹             Register   Privi-
decimal  spr[5:9] spr[0:4]   name       leged

   1     00000 00001         XER        no
   8     00000 01000         LR         no
   9     00000 01001         CTR        no
  18     00000 10010         DSISR      yes
  19     00000 10011         DAR        yes
  22     00000 10110         DEC        yes
  25     00000 11001         SDR1       yes
  26     00000 11010         SRR0       yes
  27     00000 11011         SRR1       yes
 272     01000 10000         SPRG0      yes
 273     01000 10001         SPRG1      yes
 274     01000 10010         SPRG2      yes
 275     01000 10011         SPRG3      yes
 280     01000 11000         ASR²       yes
 282     01000 11010         EAR        yes
 284     01000 11100         TBL        yes
 285     01000 11101         TBU        yes
 528     10000 10000         IBAT0U     yes
 529     10000 10001         IBAT0L     yes
 530     10000 10010         IBAT1U     yes
 531     10000 10011         IBAT1L     yes
 532     10000 10100         IBAT2U     yes
 533     10000 10101         IBAT2L     yes
 534     10000 10110         IBAT3U     yes
 535     10000 10111         IBAT3L     yes
 536     10000 11000         DBAT0U     yes
 537     10000 11001         DBAT0L     yes
 538     10000 11010         DBAT1U     yes
 539     10000 11011         DBAT1L     yes
 540     10000 11100         DBAT2U     yes
 541     10000 11101         DBAT2L     yes
 542     10000 11110         DBAT3U     yes
 543     10000 11111         DBAT3L     yes

¹ Note that the order of the two 5-bit halves
  of the SPR number is reversed.

² 64-bit implementations only.


Figure 47. SPR encodings for mtspr


— Programming Note - 

For a discussion of software synchronization 
requirements when altering certain Special 
Purpose Registers, please refer to Appendix L, 
“Synchronization Requirements for Special 
Registers” on page 269. 


— Compatibility Note - 

For a discussion of Power compatibility with 
respect to SPR numbers not shown in the instruc¬ 
tion descriptions for mtspr and mfspr, please refer 
to the “Incompatibilities with the Power Architec¬ 
ture” appendix of Part 1, “PowerPC User Instruc¬ 
tion Set Architecture” on page 1. For 
compatibility with future versions of this architec¬ 
ture, only SPR numbers discussed in these 
instruction descriptions should be used. 


Move From Special Purpose Register XFX-form

mfspr RT,SPR

|  31  |  RT  |  spr  |  339  |  /  |
0      6      11      21      31

n <- spr[5:9] || spr[0:4]
if length(SPREG(n)) = 64 then
    RT <- SPREG(n)
else
    RT <- (32)0 || SPREG(n)

The SPR field denotes a Special Purpose Register, 
encoded as shown in Figure 48 on page 154. The 
contents of the designated Special Purpose Register 
are placed into register RT. For Special Purpose Reg¬ 
isters that are 32 bits long, the low-order 32 bits of RT 
receive the contents of the Special Purpose Register 
and the high-order 32 bits of RT are set to zero. 
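The width handling of mtspr and mfspr on a 64-bit implementation can be sketched as a small model. The class and its names are hypothetical, privilege checks and invalid forms are left out, and the two register lengths used in the example (DSISR as 32-bit, SPRG0 as 64-bit) follow the register descriptions earlier in this chapter.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class SprFile:
    """Toy model of SPR access on a 64-bit implementation."""

    def __init__(self, lengths):
        self.lengths = dict(lengths)  # SPR number -> 32 or 64
        self.values = {n: 0 for n in self.lengths}

    def mtspr(self, n, rs):
        # A 32-bit SPR receives only the low-order 32 bits of RS.
        mask = MASK64 if self.lengths[n] == 64 else MASK32
        self.values[n] = rs & mask

    def mfspr(self, n):
        # A 32-bit SPR is returned with the high-order 32 bits zero.
        return self.values[n]


sprs = SprFile({18: 32, 272: 64})  # DSISR, SPRG0
sprs.mtspr(18, 0x1122334455667788)
sprs.mtspr(272, 0x1122334455667788)
print(hex(sprs.mfspr(18)))   # 0x55667788
print(hex(sprs.mfspr(272)))  # 0x1122334455667788
```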

spr[0] = 1 if and only if reading the register is
privileged. Execution of this instruction specifying a
defined and privileged register when MSR[PR] = 1 will
result in a Privileged Instruction type Program
interrupt.

Additional values of the SPR field, beyond those 
shown in Figure 48 on page 154, may be defined in 
Book IV, PowerPC Implementation Features for the 
implementation (see also Appendix M, 
“Implementation-Specific SPRs” on page 273). If the 
SPR field contains any value other than one of these 
implementation-specific values or one of the values 
shown in the Figure, the instruction form is invalid. 
However, if MSR[PR] = 1 then the only effect of
executing an invalid instruction form in which spr[0] = 1 is
to cause either a Privileged Instruction type Program
interrupt or an Illegal Instruction type Program
interrupt.

Special Registers Altered: 

None 

- Compiler/Assembler/Compatibility Notes - 

See the Notes that appear with mtspr. 
            SPR¹             Register   Privi-
decimal  spr[5:9] spr[0:4]   name       leged

   1     00000 00001         XER        no
   8     00000 01000         LR         no
   9     00000 01001         CTR        no
  18     00000 10010         DSISR      yes
  19     00000 10011         DAR        yes
  22     00000 10110         DEC        yes
  25     00000 11001         SDR1       yes
  26     00000 11010         SRR0       yes
  27     00000 11011         SRR1       yes
 272     01000 10000         SPRG0      yes
 273     01000 10001         SPRG1      yes
 274     01000 10010         SPRG2      yes
 275     01000 10011         SPRG3      yes
 280     01000 11000         ASR²       yes
 282     01000 11010         EAR        yes
 287     01000 11111         PVR        yes
 528     10000 10000         IBAT0U     yes
 529     10000 10001         IBAT0L     yes
 530     10000 10010         IBAT1U     yes
 531     10000 10011         IBAT1L     yes
 532     10000 10100         IBAT2U     yes
 533     10000 10101         IBAT2L     yes
 534     10000 10110         IBAT3U     yes
 535     10000 10111         IBAT3L     yes
 536     10000 11000         DBAT0U     yes
 537     10000 11001         DBAT0L     yes
 538     10000 11010         DBAT1U     yes
 539     10000 11011         DBAT1L     yes
 540     10000 11100         DBAT2U     yes
 541     10000 11101         DBAT2L     yes
 542     10000 11110         DBAT3U     yes
 543     10000 11111         DBAT3L     yes

¹ Note that the order of the two 5-bit halves
  of the SPR number is reversed.

² 64-bit implementations only.

Moving from the Time Base (TB and TBU) is
accomplished with the mftb instruction,
described in Book II.


Figure 48. SPR encodings for mfspr


Move To Machine State Register X-form

mtmsr RS

|  31  |  RS  |  ///  |  ///  |  146  |  /  |
0      6      11      16      21      31

MSR <- (RS)

The contents of register RS are placed into the MSR. 

This instruction is privileged and execution synchro¬ 
nizing. 

In addition, alterations to the EE and RI bits are
effective as soon as the instruction completes. Thus if
MSR[EE] = 0 and an External or Decrementer interrupt
is pending, executing an mtmsr instruction that sets
MSR[EE] to 1 will cause the External or Decrementer
interrupt to be taken before the next instruction is
executed.
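The EE behavior just described can be sketched as a toy model. The names are hypothetical, and all other effects of mtmsr (privilege checking, synchronization, other MSR bits) are ignored. EE is MSR bit 48 {16}, which is bit 15 counted from the low-order end.

```python
MSR_EE = 1 << 15  # MSR bit 48 {16}, with bit 0 as the most significant bit


def mtmsr(rs: int, interrupt_pending: bool):
    """Return (new_msr, interrupt_taken_before_next_instruction).

    `interrupt_pending` says whether an External or Decrementer
    interrupt is waiting to be taken.
    """
    new_msr = rs
    taken = interrupt_pending and bool(new_msr & MSR_EE)
    return new_msr, taken


print(mtmsr(0x8000, True))   # (32768, True): pending interrupt is taken
print(mtmsr(0x0000, True))   # (0, False): still disabled
```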

Special Registers Altered: 

MSR 

- Programming Note - 

For a discussion of software synchronization 
requirements when altering certain MSR bits, 
please refer to Appendix L, “Synchronization 
Requirements for Special Registers” on page 269. 


Move From Machine State Register X-form

mfmsr RT

|  31  |  RT  |  ///  |  ///  |  83  |  /  |
0      6      11      16      21     31

RT <- MSR

The contents of the MSR are placed into RT. 

This instruction is privileged. 

Special Registers Altered: 
none 
Chapter 12. Storage Control


12.1 Storage Addressing 

A program references storage using the Effective 
Address computed by the processor when it executes 
a load, store, branch, or cache instruction, and when 
it fetches the next sequential instruction. The effec¬ 
tive address is translated to a real address according 
to procedures described in section 12.3, “Address
Translation Overview” on page 159 and following.
The real address is what is sent to the memory sub¬ 
system. See Figure 49 on page 159. 

For a complete discussion of storage addressing and 
effective address calculation, refer to “Storage 
Addressing” in Chapter 1 of Part 1, “PowerPC User 
Instruction Set Architecture” on page 1. 

Storage Control Overview 

■ Page size is 2^12 bytes (4 KB)

■ Segment size is 2^28 bytes (256 MB)

■ For 64-bit implementations:

— Maximum real memory size 2^64 bytes (16 EB)
— Effective Address Range 2^64
— Virtual Address Range 2^80
— Number of segments 2^52

■ For 32-bit implementations:

— Maximum real memory size 2^32 bytes (4 GB)
— Effective Address Range 2^32
— Virtual Address Range 2^52
— Number of segments 2^24

■ Two types of storage segments based on the 
state of the T bit in the Segment Table Entry or 
segment register selected by the Effective 
Address: 

— T=0: Ordinary storage segment 
— T=1: Direct-store segment 
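The sizes in the overview above are related by simple powers of two, as the following sketch shows. All names are hypothetical, and the split of an address into segment index, page index, and byte offset is only arithmetic implied by the page and segment sizes; it is not a description of the translation mechanism itself.

```python
PAGE_BITS = 12     # page size 2^12 bytes (4 KB)
SEGMENT_BITS = 28  # segment size 2^28 bytes (256 MB)


def number_of_segments(virtual_address_bits: int) -> int:
    """Segments in a virtual address space of the given width."""
    return 1 << (virtual_address_bits - SEGMENT_BITS)


def pages_per_segment() -> int:
    return 1 << (SEGMENT_BITS - PAGE_BITS)


def split_address(ea: int):
    """Split an address into (segment index, page index, byte offset)."""
    byte_offset = ea & ((1 << PAGE_BITS) - 1)
    page_index = (ea >> PAGE_BITS) & (pages_per_segment() - 1)
    segment_index = ea >> SEGMENT_BITS
    return segment_index, page_index, byte_offset


print(number_of_segments(80) == 2**52)  # True (64-bit implementations)
print(number_of_segments(52) == 2**24)  # True (32-bit implementations)
print(pages_per_segment())              # 65536
print([hex(x) for x in split_address(0x12345678)])  # ['0x1', '0x2345', '0x678']
```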


12.2 Storage Model 

The storage model provides the following features: 

1. The architecture allows the storage implementa¬ 
tions to take advantage of the performance bene¬ 
fits of weak ordering of storage access between 
processors or between processors and devices. 

2. The architecture provides instructions that allow
the programmer to ensure a consistent and
ordered storage state.

• dcbf
• dcbst
• dcbz
• icbi
• isync
• ldarx
• lwarx
• eieio
• stdcx.
• stwcx.
• sync

3. Processor ordering: storage accesses by a single
processor appear to complete sequentially from
the view of the programming model but may
complete out of order with respect to the ultimate
destination in the storage hierarchy. Order is
guaranteed at each level of the storage hierarchy
for accesses to the same address from the same
processor.

4. Storage consistency between processors and
between a processor and I/O is controlled by
software through mode bits in the page table. See
12.8.2, “Supported Storage Modes” on page 177.
Six modes are supported using the control bits:

■ Write Through

■ Caching Inhibited

■ Memory Coherence

12.2.1 Storage Segments

Storage is divided into 256 MB (2^28) segments.

- Programming Note -

It is possible to provide larger segments to
application programs by using multiple adjacent
segments.


These segments can be of two types:

■ An ordinary storage segment, referred to as a 
“storage segment” or simply as a “segment.” 
Address translation is controlled by the setting of
the relocate bits MSR[DR] for data and MSR[IR] for
instructions. MSR[IR] and MSR[DR] are independent
bits and may be set differently. The state of
these bits may be changed by interrupts or by 
executing the appropriate instructions. An effec¬ 
tive address in these segments represents a real 
or virtual address depending on the setting of the 
relocate bits of the MSR. 

■ A direct-store segment, always referred to by the 
entire name “direct-store segment.” Such seg¬ 
ments may be used for access to I/O. Instruction 
fetch from direct-store segments is not allowed. 
MSR[DR] must be 1 when accessing data in a
direct-store segment. See 12.6, “Direct-Store 
Segments” on page 173 for an explanation of 
direct-store segments. 

The value of the T bit in the Segment Table Entry or 
Segment Register distinguishes between ordinary 
storage segments and direct-store segments. 


T    Segment type

0    Ordinary storage segment
1    Direct-store segment


The T bit in the Segment Table Entry or Segment
Register is ignored when fetching instructions with
MSR[IR] = 0 or when accessing data with MSR[DR] = 0.
Such accesses are not considered references to 
direct-store segments. 

See also section 12.6, “Direct-Store Segments” on 
page 173. 

12.2.2 Storage Exceptions 

Each Effective Address must be translated to a real
address in order to complete the storage access. A storage
exception occurs if this translation fails for one of the 
following reasons: 

64-bit implementations 

■ There is no valid entry in the Segment Table 
for the segment specified by the Effective 
Address. 

■ The appropriate Segment Table entry is 
found, but there is no valid entry in the Page 
Table for the page specified by the Effective 
Address. 

■ Both the appropriate Segment Table and 
Page Table entries are found, but the access 
is not allowed by the storage protection 
mechanism. 


32-bit implementations 

■ There is no valid entry in the Page Table for 
the page specified by the Effective Address. 

■ The appropriate Page Table entry is found 
but the access is not allowed by the storage 
protection mechanism. 

Storage exceptions cause Instruction Storage inter¬ 
rupts and Data Storage interrupts that identify the 
address of the failing instruction. 

In certain cases a storage exception may result in the 
“restart” of (re-execution of at least part of) a load or 
store instruction. See the section entitled “Instruction 
Restart” in Part 2, “PowerPC Virtual Environment 
Architecture” on page 117.

12.2.3 Instruction Fetch 

Instructions are fetched under control of MSR[IR].
When any context synchronizing event occurs, any
prefetched instructions are discarded, and then
refetched using the then-current state of MSR[IR].

MSR[IR] = 0

When instruction relocation is off, MSR[IR] = 0, the
effective address is interpreted as described in
section 12.2.6, “Real Addressing Mode” on page 158.

MSR[IR] = 1

Instructions are fetched using the address translated 
by one of the following mechanisms: 

1. Segmented Address Translation Mechanism 

2. Block Address Translation Mechanism 

Instruction fetch from direct-store segments is not 
supported. An attempt to execute an instruction 
fetched from a direct-store segment will result in an
Instruction Storage interrupt.

12.2.4 Data Storage Access 

Data accesses are controlled by MSR[DR]. When the
state of MSR[DR] changes, subsequent accesses are
made using the new state of MSR[DR].

MSR[DR] = 0

When data relocation is off, MSR[DR] = 0, the effective
address is interpreted as described in section 12.2.6,
“Real Addressing Mode” on page 158.

MSR[DR] = 1

When data relocation is on, MSR[DR] = 1, the
effective address is translated by one of the following
mechanisms:

1. Segmented Address Translation Mechanism

2. Block Address Translation Mechanism

3. Direct-Store Segment Translation Mechanism 

12.2.5 Speculative Execution 

Data Access 

A speculative operation is one that a program 
“might” perform and that the hardware decides to 
execute out of order on the speculation that the result 
will be needed. If subsequent events indicate that the 
speculative instruction would not have been executed, 
the processor abandons any result the instruction 
produced. Typically, hardware executes instructions 
speculatively when it has resources that would other¬ 
wise be idle, so that the operation is done without 
cost or almost so. 

Most operations can be performed speculatively, as 
long as the machine appears to follow a simple 
sequential model such as presented in Part 1, 
“PowerPC User Instruction Set Architecture” on 
page 1. Certain speculative operations are not per¬ 
mitted: 

■ A speculative store may not be performed in such 
a manner that the alteration of the target location 
can be observed by other processors or mech¬ 
anisms until it can be determined that the store is 
no longer speculative. 

■ Speculative loads from “Guarded storage” (see 
below) are prohibited, except that if a load or 
store operation will be executed, the entire cache 
block(s) containing the referenced data may be 
loaded into the cache. 

■ No error of any kind other than Machine Check 
may be reported due to the speculative execution 
of an instruction, until such time as it is known 
that execution of the instruction is required. 

Speculative loads are allowed from any storage that 
is not “Guarded storage.” If in doing so a Machine 
Check exception results, a Machine Check interrupt 
may be generated even though the data access that 
caused the Machine Check exception would not have 
been performed because a previous uncompleted 
operation would have changed the execution path. 

Only one side effect (other than Machine Check) of 
speculative execution is permitted when a speculative 
instruction's result is abandoned: the Reference bit in 
a Page Table Entry may be set due to a speculative 
load. 

Instruction Prefetch 

The processor typically fetches instructions ahead of
the one(s) currently being executed in order to avoid
delay. Such instruction prefetching is a speculative
operation in that prefetched instructions may not be
executed due to intervening branches or interrupts.

Most prefetching is permitted, as long as the machine 
appears to follow a simple sequential model such as 
presented in Part 1, “PowerPC User Instruction Set 
Architecture” on page 1. Certain prefetching is not 
permitted: 

■ Prefetching from “Guarded storage” (see below) 
is prohibited, except that if an instruction in a 
cache block will be executed, the entire cache 
block may be loaded. 

■ No error of any kind other than Machine Check 
may be reported due to instruction prefetching, 
until such time as the instruction that is the 
target of such prefetch becomes the instruction to 
be executed. 

Speculative instruction fetches are allowed from any 
storage that is not “Guarded storage.” If in doing so, 
a Machine Check exception results, a Machine Check 
interrupt may be generated even if the instruction 
fetch that caused the Machine Check exception would 
not have been executed because a previous uncom¬ 
pleted operation would have changed the execution 
path. 

Only one side effect (other than Machine Check) of 
instruction prefetching is permitted: the Reference bit 
in a Page Table Entry may be set. 

Guarded Storage 

Storage is said to be “Guarded” if either (a) the G bit
is one in the relevant PTE or DBAT register, or (b)
MSR bit IR or DR is zero for instruction fetches or
data loads respectively. (In case (b) all of storage is
Guarded.)
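The two conditions in this definition can be written out as a small predicate. This is a sketch with hypothetical names; `g_bit` stands for the G bit of whichever PTE or DBAT register is relevant to the access, and it is only consulted when translation is enabled for that kind of access.

```python
def is_guarded(access: str, msr_ir: int, msr_dr: int, g_bit: int) -> bool:
    """True if the storage being accessed is treated as Guarded.

    `access` is "fetch" for instruction fetches or "load" for data loads.
    """
    if access == "fetch":
        translation_on = msr_ir == 1
    elif access == "load":
        translation_on = msr_dr == 1
    else:
        raise ValueError("access must be 'fetch' or 'load'")
    # Case (b): with translation off, all of storage is Guarded.
    if not translation_on:
        return True
    # Case (a): the G bit of the relevant PTE or DBAT register decides.
    return g_bit == 1


print(is_guarded("fetch", msr_ir=0, msr_dr=1, g_bit=0))  # True
print(is_guarded("load", msr_ir=0, msr_dr=1, g_bit=0))   # False
print(is_guarded("load", msr_ir=1, msr_dr=1, g_bit=1))   # True
```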

Storage in a Guarded area may not be well-behaved 
with regard to prefetching and other speculative 
storage operations. Such storage may represent an 
I/O device, and a speculative load or instruction fetch 
directed to such a device may cause the device to 
perform unexpected or incorrect operations. 

Storage addresses in a Guarded area may not have 
successors; that is, there may be “holes” in a 
Guarded area of the real address space. On any 
system, the highest real address has no successor. 
Lack of a successor address means that speculative 
sequential operations such as instruction prefetching 
may fail and may result in a Machine Check. 


Load or Store Instruction 

A load or store instruction may not speculatively
access Guarded storage unless one of the following
conditions exists:

1. The target storage location is in a cache. In this
case, the location may be accessed in the cache 
or in main storage. 

2. The target storage is Caching Allowed (I = 0) and
it is guaranteed that the load or store is on the 
branch path that will be executed (in the absence 
of any intervening interrupts). In this case, the 
entire cache block containing the target storage 
location may be loaded into the cache. 

3. The target storage is Caching Inhibited (I = 1), the
load or store is on the branch path that will be 
executed, and no prior instructions can cause an 
interrupt. 


Instruction Fetch 

Instructions may not be speculatively fetched from
Guarded storage unless one of the following
conditions exists:

1. The target storage location is in a cache. In this 
case, the location may be accessed in the cache 
or in main storage. 

2. MSR[IR] = 1 and an instruction has been
previously fetched from the page.

3. It is guaranteed that the instruction to be fetched
is on the branch path that will be taken (in the
absence of any intervening interrupts). If
MSR[IR] = 0, only the block containing the target
instruction may be fetched.


12.2.6 Real Addressing Mode 

Whether address translation is enabled is controlled
by MSR[IR] for instruction fetching and by MSR[DR] for
data loads and stores. If address translation is
disabled for a particular access (fetch, load, or store), the
Effective Address is treated as the Real Address and
is passed directly to the memory subsystem.

The EA is a 64-bit {32-bit} quantity computed by the 
CPU. The width of the Real Address supported by a 
particular implementation will be less than or equal to 
this. If it is less, the high-order bits of the EA are 
ignored when forming the Real Address. 

Accesses in real mode bypass all storage protection 
checks (section 12.10) and do not cause the recording 
of reference and change information (section 12.9). 
Real mode data accesses are executed as though the 
storage access mode bits “WIMG” were 0011 (section 
12.8). This mode allows accesses to be cached, does 
not require the accesses to be written through the 
cache to main storage, requires the hardware to 
enforce data consistency with storage, I/O, and other
processors (caches), and treats all storage as 
Guarded storage. Real mode instruction fetches are 
executed as though the “WIMG” bits were either 0001 
or 0011. Speculative fetching of instructions and 
speculative loads from storage in real mode are pro¬ 
hibited (see “Guarded Storage” above). 
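The meaning of the “WIMG” settings quoted above can be made explicit with a small decoder. The function name and dictionary keys are hypothetical; the bit order W, I, M, G follows the order of the letters.

```python
def decode_wimg(bits: str) -> dict:
    """Decode a 4-character WIMG string such as "0011"."""
    if len(bits) != 4 or any(c not in "01" for c in bits):
        raise ValueError("expected four binary digits")
    w, i, m, g = (int(c) for c in bits)
    return {
        "write_through": w == 1,
        "caching_inhibited": i == 1,
        "memory_coherence": m == 1,
        "guarded": g == 1,
    }


# Real mode data accesses behave as though WIMG = 0011:
print(decode_wimg("0011"))
# {'write_through': False, 'caching_inhibited': False,
#  'memory_coherence': True, 'guarded': True}
```

Read this way, 0011 matches the description in the text: caching allowed, no write-through required, coherence enforced, and all storage Guarded.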

Access to direct-store segments (section 12.6) is not 
possible when translation is disabled, as Segment 
Table Entries (section 12.4.1.2) or Segment Registers 
(section 12.5.1.1) are not checked for a T=1 specifica¬ 
tion. 

WARNING: An attempt to fetch from, load from, or 
store to a Real Address that is not physically present 
in the machine may result in a Machine Check inter¬ 
rupt or a Checkstop (Section 13.5.2). 

12.3 Address Translation Overview


Figure 49 gives an overview of the address translation process on PowerPC. 



[Figure 49: flow diagram. The Effective Address is presented to both
Segmented Address Translation (lookup in Segment Table) and Block
Address Translation (match against BAT Registers). An ordinary segment
proceeds to Virtual Address Translation (lookup in Page Table) and yields
a Real Address; a direct-store segment yields an I/O Address; a BAT
match yields a Real Address.]


Figure 49. PowerPC Address Translation 


The Effective Address (EA) is the address generated 
by the processor for load and store instructions or for 
instruction fetch. This address is passed simultane¬ 
ously to two translation mechanisms: 

■ Segmented Address Translation, described in 
section 12.4 on page 160 for 64-bit implementa¬ 
tions, and in section 12.5 on page 168 for 32-bit 
implementations, and 

■ Block Address Translation, described in section 
12.7 on page 174. 

A typical Effective Address will be successfully trans¬ 
lated by just one of these mechanisms. If neither 
mechanism is successful, a storage exception (page 
156) results. If both mechanisms are successful,
Block Address Translation takes precedence.


An Effective Address that translates successfully via 
the Segmented Address Translation mechanism (but 
not by the Block Address Translation mechanism) is a 
reference to one of two types of segments: 

■ A direct-store segment, in which case the address 
is converted directly into an I/O address and is 
passed to the I/O subsystem for further action, or 

■ An ordinary segment, in which case the address 
is converted into a real address that is then used 
to access storage. 

An Effective Address that translates successfully via 
the Block Address Translation mechanism is con¬ 
verted directly into a real address that is then used to 
access storage. 
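
The precedence rules above can be summarized in a small decision function. This is only a sketch of the selection logic; the function name, its boolean inputs, and the returned labels are illustrative assumptions and do not model the translation mechanisms themselves.

```python
def select_translation(bat_hit, segment_hit, direct_store=False):
    """Decide which mechanism supplies the result for one Effective Address.

    bat_hit      -- Block Address Translation succeeded
    segment_hit  -- Segmented Address Translation succeeded
    direct_store -- the selected segment is a direct-store segment
    """
    if bat_hit:
        # Block Address Translation takes precedence when both succeed.
        return "block: real address"
    if not segment_hit:
        return "storage exception"
    if direct_store:
        return "direct-store segment: I/O address"
    return "ordinary segment: real address"
```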

12.4 Segmented Address Translation, 64-bit Implementations


Figure 50 shows the steps involved in translating from an Effective Address to a Real Address on a 64-bit imple¬ 
mentation. 



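[Figure 50: field diagram.
64-bit EA:  Effective Segment ID (36 bits) | Page (16 bits) | Byte (12 bits)
    Lookup in the Segment Table maps the Effective Segment ID to a Virtual Segment ID.
80-bit VA:  Virtual Segment ID (52 bits) | Page (16 bits) | Byte (12 bits)
    Lookup in the Page Table maps the Virtual Segment ID and Page to a Real Page Number.
Real Address:  Real Page Number (52 bits) | Byte (12 bits)]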

Figure 50. Address Translation Overview (64-bit implementations) 

If an access is translated by the Block Address Trans¬ 
lation mechanism (BAT, see Section 12.7 on page 
174), the BAT translation takes precedence and the 
results of segmented address translation are not 
used. If an access is not translated by a BAT, seg¬ 
mented address translation proceeds as follows. 

The Effective Address (EA) is a 64-bit quantity com¬ 
puted by the processor. Bits 0:35 of the EA are the 
Effective Segment ID (ESID); these are looked up in 
the Segment Table to produce a Virtual Segment ID 
(VSID). Bits 36:51 of the EA are the Page Number 
within the segment; these are concatenated with the 
VSID from the Segment Table to form the Virtual Page 
Number (VPN). The VPN is looked up in the Page 
Table to produce a Real Page Number (RPN). Bits 
52:63 of the EA are the Byte Offset within the page; 
these are concatenated with the RPN to form the Real 
Address (RA) that is used to access storage. 
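
The field boundaries just described can be expressed with shifts and masks. The sketch below assumes the architecture's bit numbering, in which bit 0 is the most significant bit of the 64-bit EA; the function names are illustrative.

```python
def split_ea64(ea):
    """Split a 64-bit Effective Address into (ESID, page, byte offset)."""
    esid = (ea >> 28) & ((1 << 36) - 1)  # EA bits 0:35
    page = (ea >> 12) & 0xFFFF           # EA bits 36:51
    byte = ea & 0xFFF                    # EA bits 52:63
    return esid, page, byte


def real_address(rpn, byte):
    """Concatenate a Real Page Number with the 12-bit byte offset."""
    return (rpn << 12) | byte
```

For example, the EA 0x0123456789ABCDEF splits into ESID 0x012345678, page 0x9ABC, and byte offset 0xDEF.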

If the processor is executing in 32-bit mode
(MSR(SF) = 0), the translation process described above
is followed except that the high-order 32 bits of the
64-bit Effective Address (that is, bits 0:31 of the ESID)
are forced to zero before the lookup in the Segment
Table starts. Bits 32:35 of the EA, which are the high-order
4 bits of the lower 32 bits of the EA, thus constitute
the ESID.

If the selected Segment Table Entry identifies the 
segment as a direct-store segment, the Page Table is 
not referred to. Rather, translation continues as 
described in 12.6, “Direct-Store Segments” on 
page 173. 

For ordinary storage segments the translation moves 
in two steps from Effective Address to Virtual Address 
(which never exists as a specific entity but can be 
considered to be the concatenation of the VPN and 
Byte Offset), and from Virtual Address to Real 
Address. 

The first step in segmented address translation is to 
convert the effective address into a virtual address, 
described in section 12.4.1 on page 161. The second 
step, conversion of the virtual address into a real 
address, is described in section 12.4.2 on page 164. 

12.4.1 Virtual Address Generation, 64-bit Implementations


Conversion of a 64-bit Effective Address to a Virtual Address is done by searching a hashed segment table 
pointed to by the Address Space Register. 


[Figure 51: the Address Space Register (ASR) holds the real address of the
Segment Table. A hash function of EA bits 31:35 selects a group of 16-byte
Segment Table Entries (STEs). The matching STE supplies the 52-bit Virtual
Segment ID, which is followed by the 16-bit Page (EA bits 36:51) and the
12-bit Byte (EA bits 52:63) to form the 80-bit Virtual Address. The Virtual
Segment ID and Page together form the Virtual Page Number (VPN).]

Figure 51. Translation of 64-bit Effective Address to Virtual Address


12.4.1.1 Address Space Register 

The ASR is shown in Figure 52. This 64-bit special-purpose
register holds the real address of the
Segment Table. The Segment Table defines the set of
segments that can be addressed at any one time; it is
usual to have different segment tables for different
processes. The contents of the ASR are usually part
of the process state.

Access to the ASR is privileged. The ASR may be
read or written by the mfspr and mtspr instructions.
See “Move From Special Purpose Register
XFX-form” on page 79 and “Move To Special Purpose
Register XFX-form” on page 79.

Bits 0:51    Real address of Segment Table
Bits 52:63   Reserved (///)

Figure 52. Address Space Register 

- Programming Note - 

The values 0, 0x1000, and 0x2000 cannot be used 
as Segment Table addresses, since these pages 
contain interrupt vectors. 
T=0

Dword  Bit    Name  Description
0      0:35   ESID  Effective Segment ID
       56     V     Entry valid if V=1
       57     T     Direct-store segment if T=1
       58     Ks    Supervisor state storage key
       59     Kp    Problem state storage key
1      0:51   VSID  Virtual SID
       0:63   IO    I/O specific

All other fields are reserved. 


Figure 53. Segment Table Entry format 

12.4.1.2 Segment Table 

The Segment Table (STAB) is a one-page data struc¬ 
ture that defines the mapping between Effective 
Segment IDs and Virtual Segment IDs. The STAB 
must be on a page boundary. 

The STAB contains 32 Segment Table Entry Groups, 
or STEGs. A STEG contains 8 Segment Table Entries 
(STEs) of 16 bytes each; each STEG is thus 128 bytes 
long. STEGs are entry points for searches of the 
Segment Table. 

See section 12.12, “Table Update Synchronization 
Requirements” on page 186 for the rules that soft¬ 
ware must follow when updating the Segment Table. 

Segment Table Entry 

Each Segment Table Entry (STE) maps one ESID to 
one VSID. Additional information in the STE controls 
the STAB search process and provides input to the 
storage protection mechanism. Figure 53 shows the 
layout of an STE. 

See 12.10, “Storage Protection” on page 179 for a 
discussion of the storage key bits. 

12.4.1.3 Segment Table Search 

An outline of the STAB search process is shown in 
Figure 51 on page 161. The detailed algorithm is as 
follows: 

1. Primary Hash: Bits 0:51 of the ASR are concatenated
with bits 31:35 of the Effective Address
(the low 5 bits of the ESID) and with a field of
seven 0s to form the 64-bit real address of a
Segment Table Entry Group. This operation is
referred to as the “Primary STAB Hash.” This
identifies a particular STEG, each of whose 8
STEs will be tested in turn.

2. The first STE in the selected STEG is tested for a 
match with the EA. In order for a match to exist, 
the following must be true: 

■ STE(V) = 1

■ STE(ESID) = EA(0:35)

If a match is found, the STE search terminates
successfully.

3. Step 2 is repeated for each of the other 7 STEs in 
the STEG. The first matching STE terminates the 
search. If none of the 8 STEs match, the sec¬ 
ondary hash must be tried. 

4. Secondary Hash: Bits 0:51 of the ASR are concatenated
with the ones-complement of bits 31:35
of the Effective Address and with a field of seven
0s to form the 64-bit real address of a Segment
Table Entry Group. This operation is referred to
as the “Secondary STAB Hash.”

5. The first STE in the selected STEG is tested for a 
match with the EA. In order for a match to exist, 
the following must be true: 

■ STE(V) = 1

■ STE(ESID) = EA(0:35)

If a match is found, the STE search terminates 
successfully. 

6. Step 5 is repeated for each of the other 7 STEs in 
the STEG. The first matching STE terminates the 
search. If none of the 8 STEs match, the search 
fails. 

If the Segment Table search succeeds, the Virtual 
Page Number (VPN) is formed by concatenating the 
VSID from the matching STE with bits 36:51 of the 
Effective Address (the page number). The complete 
80-bit Virtual Address (VA) is formed by concatenating
the VPN with bits 52:63 of the EA (the byte offset). 
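
The address arithmetic of the primary and secondary STAB hashes, and the formation of the Virtual Address after a successful search, can be sketched as follows. The function names are illustrative, and the sketch covers only the address computation, not the test of the eight STEs in each group.

```python
def steg_addresses(asr, ea):
    """Return the real addresses of the (primary, secondary) STEGs for an EA."""
    base = asr & ~0xFFF                       # ASR bits 0:51 (page-aligned table)
    low5 = (ea >> 28) & 0x1F                  # EA bits 31:35, low 5 bits of ESID
    primary = base | (low5 << 7)              # append seven 0 bits
    secondary = base | ((~low5 & 0x1F) << 7)  # ones-complement of the 5 bits
    return primary, secondary


def virtual_address(vsid, ea):
    """Concatenate VSID, page (EA bits 36:51), and byte offset (EA bits 52:63)."""
    page = (ea >> 12) & 0xFFFF
    byte = ea & 0xFFF
    return (vsid << 28) | (page << 12) | byte
```

With a Segment Table at real address 0x3000 and an EA whose ESID ends in 0b00011, the primary STEG is at 0x3180 and the secondary STEG is at 0x3E00. Both lie within the one-page table, since 5 index bits and 7 zero bits span exactly 4096 bytes.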

If the search fails, a page fault interrupt is taken. This 
will be an Instruction Storage interrupt or a Data 
Storage interrupt, depending on whether the Effective 
Address is for an instruction fetch or for data access. 

If the selected STE has T=1, the reference is to a 
direct-store segment. No reference is made to the 
Page Table; processing continues as described in 
12.6, “Direct-Store Segments” on page 173. 


Segment Lookaside Buffer 

Conceptually, the segment table is searched by the 
address relocation hardware to translate every refer¬ 
ence. For performance reasons the hardware usually 
keeps a Segment Lookaside Buffer (SLB) that holds 
STEs that have recently been used. The SLB is 
searched prior to searching the Segment Table. As a 
consequence, when software makes changes to the 
Segment Table it must perform the appropriate SLB 
invalidate operations to maintain the consistency of 
the SLB with the tables. 

- Programming Notes - 

1. Segment table entries may or may not be 
cached in an SLB. 

2. Table lookups are done using real addresses 
and storage access mode M = 1 (Memory 
Coherence). 

3. If software plans to access the STAB with 
data relocate on, MSR(DR) = 1, it must avoid
cache synonyms by mapping these tables 
such that the real and virtual address bits 
used for cache set selection are the same, 
just as is required for other virtual accesses. 
See address alignment requirements 
described in Part 2, “PowerPC Virtual Envi¬ 
ronment Architecture” on page 117. 

4. It is possible that the hardware implements 
two SLB arrays (one for data and one for 
instruction). In this case the size, shape and 
values contained by the arrays may be dif¬ 
ferent. 

5. The ASR must point to a valid Segment Table
whenever address relocation is enabled
(MSR(IR) = 1 or MSR(DR) = 1 or both) and the
Effective Address is not covered by BAT
translation.

6. Use the slbie or slbia instruction to ensure
that the SLB no longer contains a mapping for
a particular segment.

7. See Appendix L, “Synchronization Require¬ 
ments for Special Registers" on page 269, for 
the synchronization requirements that must 
be satisfied when a program changes the con¬ 
tents of the ASR. 

8. Hardware never modifies the Segment Table. 


12.4.1.4 32-bit Execution Mode 

When a 64-bit implementation executes in 32-bit mode
(MSR(SF) = 0), the Segment Table search is modified as
follows:

1. The 64-bit Effective Address is computed by the 
processor as usual. 

2. The high-order 32 bits of the EA are forced to 
zero. Thus the Effective Segment ID consists of 
32 0's concatenated with the high-order 4 bits of 
the lower half of the 64-bit EA. 

3. The modified EA is then used as input to the 
Segment Table search. 

The zeroing of the high-order 32 bits effectively trun¬ 
cates the 64-bit EA to a 32-bit EA such as would have 
been generated on a 32-bit implementation. The ESID 
in 32-bit mode is the high-order 4 bits of this trun¬ 
cated EA; the ESID thus lies in the range 0:15. These 
4 bits would select a Segment Register on a 32-bit 
implementation; they select one of 16 STEGs in the 
Segment Table on a 64-bit implementation. These 
STEGs can be used to emulate the 32-bit machine's 
Segment Registers. 

This truncation of the EA is the sole effect of 32-bit
mode (MSR(SF) = 0) on address translation; everything
else proceeds as for 64-bit mode.
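
The effect of 32-bit mode on the input to the Segment Table search can be sketched as below. The function names are illustrative, and the sf argument simply stands for the value of MSR(SF).

```python
def lookup_ea(ea, sf):
    """Return the EA presented to the Segment Table search."""
    ea &= (1 << 64) - 1
    # In 32-bit mode (sf == 0) the high-order 32 bits are forced to zero.
    return ea if sf else ea & 0xFFFFFFFF


def esid(ea):
    """Effective Segment ID: EA bits 0:35."""
    return ea >> 28


def primary_steg_index(ea):
    """Index of the primary STEG: the low 5 bits of the ESID."""
    return esid(ea) & 0x1F
```

In 32-bit mode the ESID is at most 15, so the primary hash only ever selects STEGs 0 through 15.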

12.4.2 Virtual to Real Translation, 64-bit Implementations

Conversion of an 80-bit Virtual Address to a Real Address is done by searching a hashed page table located by 
SDR1. 

Figure 54. Translation of 80-bit Virtual Address to 64-bit Real Address


Generation of the 80-bit Virtual Address that is input
to this stage of the translation process is described in
12.4.1, “Virtual Address Generation, 64-bit
Implementations” on page 161.

12.4.2.1 Page Table

The Hashed Page Table (HTAB) is a variable-sized 
data structure that defines the mapping between 
Virtual Page Numbers and Real Page Numbers. The 
HTAB's size must be a power of 2, and its starting 
address must be a multiple of its size. 

The layout of the HTAB is similar to that of the 
Segment Table, except that the HTAB's size is vari¬ 
able while the STAB's size is exactly one page. The 
HTAB contains a number of Page Table Entry Groups, 
or PTEGs. A PTEG contains 8 Page Table Entries 
(PTEs) of 16 bytes each; each PTEG is thus 128 bytes 
long. PTEGs are entry points for searches of the Page 
Table. 

See section 12.12, “Table Update Synchronization 
Requirements” on page 186 for the rules that soft¬ 
ware must follow when updating the Page Table. 

Page Table Entry

Each Page Table Entry (PTE) maps one VPN to one
RPN. Additional information in the PTE controls the
HTAB search process and provides input to the
storage protection mechanism. Figure 55 shows the
layout of a PTE.

[Figure 55: layout table, only partly recoverable. Surviving field
descriptions: Real Page Number; Reference bit; C (bit 56), Change bit;
Storage access controls; PP, Page protection bits.]

All other fields are reserved.

Figure 55. Page Table Entry, 64-bit implementations

The PTE contains an Abbreviated Page Index rather
than the complete Page field. At least 11 of the low-order
bits of the VPN are used in the hash function to
select a PTEG. These bits are not repeated in the
PTEs of that PTEG.

Page Table Size

The number of entries in the Page Table directly
affects performance because it influences the hit ratio
in the Page Table and thus the rate of page fault
interrupts. If the table is too small, it is possible that
not all the virtual pages that actually have real page
frames assigned can be mapped via the Page Table.
This can happen if too many hash collisions occur and
there are more than 16 entries for the same
primary/secondary pair of PTEGs. While this situation
cannot be guaranteed not to occur for any size Page
Table, making the Page Table larger than the
minimum size will reduce the frequency of occurrence
of such collisions.

- Programming Note -

It is recommended that the number of PTEGs in
the Page Table be at least one-half the number of
real pages to be accessed.

As an example, if the amount of real memory to
be accessed is 2^31 bytes (2 GB), then we have
2^(31-12) = 2^19 real pages. The minimum recommended
Page Table size would be 2^18 PTEGs, or
2^25 bytes (32 MB).

12.4.2.2 Storage Description Register 1

The SDR1 register is shown in Figure 56.

Bits    Name      Description
0:45    HTABORG   Real address of Page Table
58:63   HTABSIZE  Encoded size of Page Table

All other fields are reserved.

Figure 56. SDR1, 64-bit implementations

The HTABORG field in SDR1 contains the high-order
46 bits of the 64-bit real address of the page table.
The Page Table is thus constrained to lie on a 2^18 byte
(256 KB) boundary at a minimum. At least 11 bits
from the hash function (Figure 54 on page 164) are
used to index into the Page Table. The minimum size
Page Table is 256 KB (2^11 PTEGs of 128 bytes each).

The Page Table can be any size 2^n where 18 ≤ n ≤ 46.
As the table size is increased, more bits are used 
from the hash to index into the table and the value in 
HTABORG must have more of its low-order bits equal 
to 0. The HTABSIZE field in SDR1 contains an integer 
giving the number of bits from the hash that are used 
in the Page Table index. HTABSIZE is used to gen¬ 
erate a mask of the form 0b00...011...1, that is, a 
string of 0 bits followed by a string of 1 bits. The 1 
bits determine which additional bits (beyond the 
minimum of 11) from the hash are used in the index; 
HTABORG must have this same number of low-order 
bits equal to 0. See Figure 54 on page 164. 

Example 

Suppose that the Page Table is 16,384 (2^14) 128-byte
PTEGs, for a total size of 2^21 bytes (2 MB). A 14-bit
index is required. Eleven bits are provided from the
hash to start with, so 3 additional bits from the hash
must be selected. Thus the value in HTABSIZE must
be 3 and the value in HTABORG must have its low-order
3 bits (bits 43:45 of SDR1) equal to 0. This
means that the Page Table must begin on a
2^(3+11+7) = 2^21 = 2 MB boundary.
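
The sizing arithmetic in this example, together with the recommendation in the Programming Note under “Page Table Size”, can be sketched as follows. The function names and the returned tuple are illustrative assumptions; the upper limit of 39 index bits follows from the maximum table size of 2^46 bytes.

```python
PTEG_BYTES = 128        # 8 PTEs of 16 bytes each
MIN_INDEX_BITS = 11     # hash bits always used in the index
MAX_INDEX_BITS = 39     # corresponds to a 2^46-byte table


def recommended_min_ptegs(real_memory_bytes):
    """At least one-half the number of 4 KB real pages to be accessed."""
    return (real_memory_bytes >> 12) // 2


def htab_parameters_64(num_ptegs):
    """Return (HTABSIZE, mask, table size in bytes) for a table of num_ptegs PTEGs."""
    if num_ptegs <= 0 or num_ptegs & (num_ptegs - 1):
        raise ValueError("number of PTEGs must be a power of 2")
    index_bits = num_ptegs.bit_length() - 1
    if not MIN_INDEX_BITS <= index_bits <= MAX_INDEX_BITS:
        raise ValueError("table size out of range")
    htabsize = index_bits - MIN_INDEX_BITS  # additional hash bits used
    mask = (1 << htabsize) - 1              # 0b00...011...1
    # The table must start on a multiple of its own size in bytes.
    return htabsize, mask, num_ptegs * PTEG_BYTES
```

For the example above, htab_parameters_64(1 << 14) yields an HTABSIZE of 3, a mask of 0b111, and a size (and required alignment) of 2 MB. The recommended minimum is a lower bound only; software would round it up to a power of 2 where necessary.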

12.4.2.3 Hashed Page Table Search 

An outline of the HTAB search process is shown in 
Figure 54 on page 164. The detailed algorithm is as 
follows: 

1. Primary Hash: A 39-bit hash value is computed 
by Exclusive-ORing the low-order 39 bits of the 
VSID with a 39-bit value formed by concatenating 
23 bits of 0 with the Page index. 

2. The 64-bit real address of a PTEG is formed by 
concatenating the following values: 

■ Bits 0:17 of SDR1 (the 18 high-order bits of 
HTABORG). 

■ Bits 0:27 of the value formed in step 1 ANDed 
with the mask generated from bits 58:63 of 
SDR1 (HTABSIZE) and then ORed with bits 
18:45 of SDR1 (the 28 low-order bits of 
HTABORG). 

■ Bits 28:38 of the value formed in step 1. 

■ A 7-bit field of 0s.

This operation is referred to as the “Primary 
HTAB Hash.” This identifies a particular PTEG, 
each of whose 8 PTEs will be tested in turn. 

3. The first PTE in the selected PTEG is tested for a 
match with VPN. In order for a match to exist, 
the following must be true: 

■ PTE(H) = 0

■ PTE(V) = 1

■ PTE(VSID) = VA(0:51)

■ PTE(API) = VA(52:56)

If a match is found, the PTE search terminates 
successfully. 

4. Step 3 is repeated for each of the other 7 PTEs in 
the PTEG. The first matching PTE terminates the 
search. If none of the 8 PTEs match, the sec¬ 
ondary hash must be tried. 

5. Secondary Hash: A 39-bit hash value is computed
by taking the ones complement of the
Exclusive OR of the low-order 39 bits of the VSID
with a 39-bit value formed by concatenating 23
bits of 0 with the Page index.

6. The 64-bit real address of a PTEG is formed by 
concatenating the following values: 

■ Bits 0:17 of SDR1 (the 18 high-order bits of 
HTABORG). 

■ Bits 0:27 of the value formed in step 5 ANDed 
with the mask generated from bits 58:63 of 
SDR1 (HTABSIZE) and then ORed with bits 
18:45 of SDR1 (the 28 low-order bits of 
HTABORG). 

■ Bits 28:38 of the value formed in step 5. 

■ A 7-bit field of 0s. 

This operation is referred to as the “Secondary 
HTAB Hash.” 

7. The first PTE in the selected PTEG is tested for a 
match with VPN. In order for a match to exist, 
the following must be true: 

■ PTE(H) = 1

■ PTE(V) = 1

■ PTE(VSID) = VA(0:51)

■ PTE(API) = VA(52:56)

If a match is found, the PTE search terminates 
successfully. 

8. Step 7 is repeated for each of the other 7 PTEs in 
the PTEG. The first matching PTE terminates the 
search. If none of the 8 PTEs match, the search 
fails. 

If the Page Table search succeeds, the content of the 
PTE that translates the EA is returned. The Real 
Address (RA) is formed by concatenating the RPN 
from the matching PTE with bits 52:63 of the Effective 
Address (the byte offset). 
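
The computation of the two PTEG addresses in steps 1, 2, 5, and 6 can be sketched as follows. The function name is illustrative; the SDR1 field positions (HTABORG in bits 0:45, HTABSIZE in bits 58:63) are taken from the steps above. The sketch does not model the comparison of the eight PTEs in each group.

```python
MASK39 = (1 << 39) - 1


def pteg_addresses_64(sdr1, vsid, page):
    """Return the real addresses of the (primary, secondary) PTEGs.

    sdr1 -- 64-bit SDR1 value
    vsid -- Virtual Segment ID
    page -- 16-bit Page index (EA bits 36:51)
    """
    htaborg = sdr1 >> 18              # SDR1 bits 0:45
    htabsize = sdr1 & 0x3F            # SDR1 bits 58:63
    mask = (1 << htabsize) - 1        # 28-bit mask of the form 0b00...011...1

    primary_hash = (vsid & MASK39) ^ page      # page is zero-extended to 39 bits
    secondary_hash = ~primary_hash & MASK39    # ones complement

    def address(h):
        upper = htaborg >> 28                               # SDR1 bits 0:17
        middle = ((h >> 11) & mask) | (htaborg & 0xFFFFFFF)  # hash bits 0:27
        lower = h & 0x7FF                                   # hash bits 28:38
        return (upper << 46) | (middle << 18) | (lower << 7)

    return address(primary_hash), address(secondary_hash)
```

As a usage example under assumed values, a 2 MB table (HTABSIZE = 3) placed at real address 0x400000 gives SDR1 = 0x400003. For VSID 0x123 and Page index 0x0045 the primary PTEG is at 0x40B300 and the secondary PTEG is at 0x5F4C80, both inside the table.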

If the search fails, a page fault interrupt is taken. This 
will be an Instruction Storage interrupt or a Data 
Storage interrupt, depending on whether the Effective 
Address is for an instruction fetch or for data access. 

Translation Lookaside Buffer 

Conceptually, the Page Table is searched by the 
address relocation hardware to translate every refer¬ 
ence. For performance reasons the hardware usually 
keeps a Translation Lookaside Buffer (TLB) that holds 
PTEs that have recently been used. The TLB is 
searched prior to searching the Page Table. As a 
consequence, when software makes changes to the 
Page Table it must perform the appropriate TLB inval¬ 
idate operations to maintain the consistency of the 
TLB with the Page Table. 

- Programming Notes -

1. Page table entries may or may not be cached 
in a TLB. 

2. Table lookups are done using real addresses 
and storage access mode M = 1 (Memory 
Coherence). 

3. If software plans to access the HTAB with 
data relocate on, MSR(DR) = 1, it must avoid
cache synonyms by mapping these tables 
such that the real and virtual address bits 
used for cache set selection are the same, 
just as is required for other virtual accesses. 
See address alignment requirements 
described in Part 2, “PowerPC Virtual Envi¬ 
ronment Architecture” on page 117. 

4. It is possible that the hardware implements 
two TLB arrays (one for data and one for 
instruction). In this case the size, shape and 
values contained by the arrays may be dif¬ 
ferent. 

5. Use the tlbie or tlbia instruction to ensure
that the TLB no longer contains a mapping for
a particular page.

6. Refer to Book IV, PowerPC Implementation
Features for the procedure to be used to
invalidate the entire TLB.




12.5 Segmented Address Translation, 32-bit Implementations 


Figure 57 shows the steps involved in translating from an effective address to a real address on a 32-bit imple¬ 
mentation. 

Figure 57. Address Translation Overview (32-bit implementations)

If an access is translated by the Block Address Trans¬ 
lation mechanism (BAT, see Section 12.7 on page 
174), the BAT translation takes precedence and the 
results of segmented address translation are not 
used. If an access is not translated by a BAT, seg¬ 
mented address translation proceeds as follows. 

The Effective Address (EA) is a 32-bit quantity com¬ 
puted by the processor. Bits 0:3 of the EA are the 
Segment Register number. These are used to select 
a Segment Register, from which is extracted a Virtual 
Segment ID (VSID). Bits 4:19 of the EA are the Page 
Number within the segment; these are concatenated 
with the VSID from the Segment Register to form the 
Virtual Page Number (VPN). The VPN is looked up in 
the Page Table to produce a Real Page Number (RPN). 
Bits 20:31 of the EA are the Byte Offset within the 
page; these are concatenated with the RPN to form 
the Real Address (RA) that is used to access storage. 
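
The field boundaries of the 32-bit EA can be expressed with shifts and masks. The sketch assumes the architecture's bit numbering, in which bit 0 is the most significant bit; the function names are illustrative.

```python
def split_ea32(ea):
    """Split a 32-bit Effective Address into (segment register number, page, byte)."""
    sr_number = (ea >> 28) & 0xF   # EA bits 0:3
    page = (ea >> 12) & 0xFFFF     # EA bits 4:19
    byte = ea & 0xFFF              # EA bits 20:31
    return sr_number, page, byte


def real_address(rpn, byte):
    """Concatenate a Real Page Number with the 12-bit byte offset."""
    return (rpn << 12) | byte
```

For example, the EA 0xA1234567 selects Segment Register 10 and has page 0x1234 and byte offset 0x567.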


If the selected Segment Register identifies the 
segment as a direct-store segment, the Page Table is 
not referred to. Rather, translation continues as 
described in 12.6, “Direct-Store Segments” on 
page 173. 

For ordinary storage segments the translation moves 
in two steps from Effective Address to Virtual Address 
(which never exists as a specific entity but can be 
considered to be the concatenation of the VPN and 
Byte Offset), and from Virtual Address to Real 
Address. 

The first step in segmented address translation is to 
convert the effective address into a virtual address, 
described in section 12.5.1 on page 169. The second 
step, conversion of the virtual address into a real 
address, is described in section 12.5.2 on page 170. 

12.5.1 Virtual Address Generation, 32-bit Implementations


Conversion of a 32-bit Effective Address to a Virtual 
Address is done by using the 4 high-order bits of the 
EA to select a Segment Register. 

Figure 58. Translation of 32-bit Effective Address to Virtual Address


12.5.1.1 Segment Registers 

The 16 32-bit Segment Registers are present only in 32-bit
implementations of PowerPC. Figure 59 shows the
layout of a Segment Register. The fields in the
Segment Register are interpreted differently
depending on the value of bit 0 (the T bit).

Figure 59. Segment Register format

If an access is translated by the Block Address Trans¬ 
lation mechanism (BAT, see Section 12.7 on page 
174), the BAT translation takes precedence and the 
results of translation using Segment Registers are not 
used. If an access is not translated by a BAT, and 
T=0 in the selected Segment Register, the Effective 
Address is a reference to an ordinary storage 
segment. The 52-bit Virtual Address (VA) is formed 
by concatenating 

■ the 24-bit VSID field from the Segment Register,

■ the 16-bit page index, EA(4:19), and

■ the 12-bit byte offset, EA(20:31).

The VA is then translated to a Real Address as 
described in the next section. 
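
The concatenation that forms the 52-bit Virtual Address can be sketched as follows. The function name is illustrative, and the VSID is passed in directly rather than extracted from a Segment Register image.

```python
def virtual_address_32(vsid, ea):
    """Form the 52-bit VA from a 24-bit VSID and a 32-bit EA."""
    page = (ea >> 12) & 0xFFFF   # EA bits 4:19
    byte = ea & 0xFFF            # EA bits 20:31
    return ((vsid & 0xFFFFFF) << 28) | (page << 12) | byte
```

For example, VSID 0xABCDEF and EA 0xA1234567 give the Virtual Address 0xABCDEF1234567. The high-order 4 bits of the EA do not appear in the result because they were used only to select the Segment Register.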

If T=1 in the selected Segment Register (and the 
access is not translated by a BAT), the Effective 
Address is a reference to a direct-store segment. No 
reference is made to the page table; processing con¬ 
tinues as in 12.6, “Direct-Store Segments” on 
page 173. 


Name   Description
T      T=1 selects this format
Ks     Supervisor state storage key
Kp     Problem state storage key
BUID   Bus Unit ID
       Device dependent data for I/O controller












12.5.2 Virtual to Real Translation, 32-bit Implementations 

Conversion of a 52-bit Virtual Address to a Real Address is done by searching a hashed page table located by 
SDR1. 

Figure 60. Translation of 52-bit Virtual Address to 32-bit Real Address


Generation of the 52-bit Virtual Address that is input
to this stage of the translation process is described in
12.5.1, “Virtual Address Generation, 32-bit
Implementations” on page 169.

12.5.2.1 Page Table

The Hashed Page Table (HTAB) is a variable-sized 
data structure that defines the mapping between 
Virtual Page Numbers and Real Page Numbers. The 
HTAB's size must be a power of 2, and its starting 
address must be a multiple of its size. 


The HTAB contains a number of Page Table Entry
Groups, or PTEGs. A PTEG contains 8 Page Table
Entries (PTEs) of 8 bytes each; each PTEG is thus 64
bytes long. PTEGs are entry points for searches of
the Page Table.

See section 12.12, “Table Update Synchronization
Requirements” on page 186 for the rules that software
must follow when updating the Page Table.

Page Table Entry

Each Page Table Entry (PTE) maps one VPN to one
RPN. Additional information in the PTE controls the
HTAB search process and provides input to the
storage protection mechanism. Figure 61 shows the
layout of a PTE.

Figure 61. Page Table Entry, 32-bit implementations

The PTE contains an Abbreviated Page Index rather
than the complete Page field. At least 10 of the low-order
bits of the Page are used in the hash function to
select a PTEG. These bits are not repeated in the
PTEs of that PTEG.

Page Table Size

The number of entries in the Page Table directly
affects performance because it influences the hit ratio
in the Page Table and thus the rate of page fault
interrupts. If the table is too small, it is possible that
not all the virtual pages that actually have real page
frames assigned can be mapped via the Page Table.
This can happen if too many hash collisions occur and
there are more than 16 entries for the same
primary/secondary pair of PTEGs. While this situation
cannot be guaranteed not to occur for any size Page
Table, making the Page Table larger than the
minimum size will reduce the frequency of occurrence
of such collisions.

- Programming Note -

It is recommended that the number of PTEGs in
the Page Table be at least one-half the number of
real pages to be accessed.

As an example, if the amount of real memory to
be accessed is 2^29 bytes (512 MB), then we have
2^(29-12) = 2^17 real pages. The minimum recommended
Page Table size would be 2^16 PTEGs, or
2^22 bytes (4 MB).

12.5.2.2 Storage Description Register 1

The SDR1 register is shown in Figure 62.

All other fields are reserved.

Figure 62. SDR1, 32-bit implementations

The HTABORG field in SDR1 contains the high-order
16 bits of the 32-bit real address of the page table.
The Page Table is thus constrained to lie on a 2^16 byte
(64 KB) boundary at a minimum. At least 10 bits from
the hash function (Figure 60 on page 170) are used
to index into the Page Table. The minimum size Page
Table is 64 KB (2^10 PTEGs of 64 bytes each).

The Page Table can be any size 2^n where 16 ≤ n ≤ 25.
As the table size is increased, more bits are used 
from the hash to index into the table and the value in 
HTABORG must have more of its low-order bits equal 
to 0. The HTABMASK field in SDR1 contains a mask 
value that determines how many bits from the hash 
are used in the Page Table index. This mask must be 
of the form 0b00...011...1, that is, a string of 0 bits fol¬ 
lowed by a string of 1 bits. The 1 bits determine how 
many additional bits (beyond the minimum of 10) from 
the hash are used in the index; HTABORG must have 
this same number of low-order bits equal to 0. See 
Figure 60 on page 170. 

Example 

Suppose that the Page Table is 8,192 (2^13) 64-byte
PTEGs, for a total size of 2^19 bytes (512 KB). A 13-bit
index is required. Ten bits are provided from the
hash to start with, so 3 additional bits from the hash
must be selected. Thus the value in HTABMASK
must be 0x007 and the value in HTABORG must have
its low-order 3 bits (bits 13:15 of SDR1) equal to 0.
This means that the Page Table must begin on a
2^(3+10+6) = 2^19 = 512 KB boundary.
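
The sizing arithmetic in this example can be sketched as follows. The function name and the returned pair are illustrative assumptions; the upper limit of 19 index bits follows from the maximum table size of 2^25 bytes.

```python
PTEG_BYTES = 64         # 8 PTEs of 8 bytes each
MIN_INDEX_BITS = 10     # hash bits always used in the index
MAX_INDEX_BITS = 19     # corresponds to a 2^25-byte table


def htab_parameters_32(num_ptegs):
    """Return (HTABMASK, table size in bytes) for a table of num_ptegs PTEGs."""
    if num_ptegs <= 0 or num_ptegs & (num_ptegs - 1):
        raise ValueError("number of PTEGs must be a power of 2")
    index_bits = num_ptegs.bit_length() - 1
    if not MIN_INDEX_BITS <= index_bits <= MAX_INDEX_BITS:
        raise ValueError("table size out of range")
    htabmask = (1 << (index_bits - MIN_INDEX_BITS)) - 1  # 0b00...011...1
    # The table must start on a multiple of its own size in bytes.
    return htabmask, num_ptegs * PTEG_BYTES
```

For the example above, htab_parameters_32(1 << 13) yields an HTABMASK of 0x007 and a size (and required alignment) of 512 KB.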

12.5.2.3 Hashed Page Table Search 

An outline of the HTAB search process is shown in 
Figure 60 on page 170. The detailed algorithm is as 
follows: 

1. A 19-bit hash value is computed by 
Exclusive-ORing the low-order 19 bits of the VSID 
with a 19-bit value formed by concatenating 3 bits 
of 0 with the Page index. 

2. Primary Hash: The 32-bit real address of a PTEG 
is formed by concatenating the following values: 

■ Bits 0:6 of SDR1 (the 7 high-order bits of 
HTABORG). 

■ Bits 0:8 of the value formed in step 1 ANDed 
with bits 23:31 of SDR1 (the value of 
HTABMASK) and then ORed with bits 7:15 of 
SDR1 (the 9 low-order bits of HTABORG). 

■ Bits 9:18 of the value formed in step 1. 

■ A 6-bit field of 0s. 

This operation is referred to as the “Primary 
HTAB Hash.” This identifies a particular PTEG, 
each of whose 8 PTEs will be tested in turn. 

3. The first PTE in the selected PTEG is tested for a 
match with VPN. In order for a match to exist, 
the following must be true: 

■ PTE_H = 0

■ PTE_V = 1

■ PTE_VSID = VA_0:23

■ PTE_API = VA_24:29

If a match is found, the PTE search terminates 
successfully. 

4. Step 3 is repeated for each of the other 7 PTEs in 
the PTEG. The first matching PTE terminates the 
search. If none of the 8 PTEs match, the secondary
hash must be tried.

5. A 19-bit hash value is computed by taking the 
ones complement of the Exclusive OR of the low- 
order 19 bits of the VSID with a 19-bit value 
formed by concatenating 3 bits of 0 with the Page 
index. 


6. Secondary Hash: The 32-bit real address of a 
PTEG is formed by concatenating the following 
values: 

■ Bits 0:6 of SDR1 (the 7 high-order bits of 
HTABORG). 

■ Bits 0:8 of the value formed in step 5 ANDed 
with bits 23:31 of SDR1 (the value of 
HTABMASK) and then ORed with bits 7:15 of 
SDR1 (the 9 low-order bits of HTABORG). 

■ Bits 9:18 of the value formed in step 5. 

■ A 6-bit field of 0s. 

This operation is referred to as the “Secondary 
HTAB Hash.” 

7. The first PTE in the selected PTEG is tested for a 
match with VPN. In order for a match to exist, 
the following must be true: 

■ PTE_H = 1

■ PTE_V = 1

■ PTE_VSID = VA_0:23

■ PTE_API = VA_24:29

If a match is found, the PTE search terminates 
successfully. 

8. Step 7 is repeated for each of the other 7 PTEs in 
the PTEG. The first matching PTE terminates the 
search. If none of the 8 PTEs match, the search 
fails. 
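
Steps 1, 2, 5, and 6 above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name pteg_addresses is hypothetical, SDR1 is treated as a 32-bit value with bit 0 as the most significant bit, HTABORG is taken from SDR1 bits 0:15 and HTABMASK from bits 23:31 as stated in step 2, and the Page index is taken to be 16 bits wide.

```python
def pteg_addresses(sdr1, vsid, page_index):
    """Return (primary, secondary) 32-bit real addresses of the PTEGs.

    Hypothetical helper following steps 1, 2, 5 and 6 of the search outline.
    """
    htaborg = (sdr1 >> 16) & 0xFFFF   # SDR1 bits 0:15
    htabmask = sdr1 & 0x1FF           # SDR1 bits 23:31

    def pteg(hash19):
        high = (htaborg >> 9) & 0x7F                            # SDR1 bits 0:6
        mid = ((hash19 >> 10) & htabmask) | (htaborg & 0x1FF)   # hash bits 0:8
        low = hash19 & 0x3FF                                    # hash bits 9:18
        return (high << 25) | (mid << 16) | (low << 6)          # 6 low-order 0 bits

    # Step 1: XOR the low-order 19 bits of the VSID with 0b000 || Page index.
    primary_hash = (vsid & 0x7FFFF) ^ (page_index & 0xFFFF)
    # Step 5: ones complement of the same Exclusive OR, kept to 19 bits.
    secondary_hash = ~primary_hash & 0x7FFFF
    return pteg(primary_hash), pteg(secondary_hash)
```

With the 512 KB table of the earlier example placed at real address 0x00080000 (SDR1 = 0x00080007), both addresses fall inside the table and are 64-byte aligned, as expected for a PTEG address.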

If the Page Table search succeeds, the content of the 
PTE that translates the EA is returned. The Real 
Address (RA) is formed by concatenating the RPN 
from the matching PTE with bits 20:31 of the Effective 
Address (the byte offset). 
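
The concatenation just described can be sketched as below. The helper name real_address is hypothetical, and the sketch assumes a 32-bit implementation in which the RPN is 20 bits wide and the byte offset is the low-order 12 bits of the Effective Address (4 KB pages).

```python
def real_address(rpn, ea):
    """Form RA = RPN || EA bits 20:31 (assumed 20-bit RPN, 12-bit offset)."""
    return ((rpn & 0xFFFFF) << 12) | (ea & 0xFFF)
```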

If the search fails, a page fault interrupt is taken. This 
will be an Instruction Storage interrupt or a Data 
Storage interrupt, depending on whether the Effective 
Address is for an instruction fetch or for data access. 

Translation Lookaside Buffer 

Conceptually, the Page Table is searched by the 
address relocation hardware to translate every
reference. For performance reasons the hardware usually
keeps a Translation Lookaside Buffer (TLB) that holds 
PTEs that have recently been used. The TLB is 
searched prior to searching the Page Table. As a 
consequence, when software makes changes to the 
Page Table it must perform the appropriate TLB
invalidate operations to maintain the consistency of the
TLB with the Page Table.

- Programming Notes -

1. Page table entries may or may not be cached 
in a TLB. 

2. Table lookups are done using real addresses 
and storage access mode M = 1 (Memory 
Coherence). 

3. If software plans to access the HTAB with
data relocate on, MSR_DR = 1, it must avoid
cache synonyms by mapping these tables
such that the real and virtual address bits
used for cache set selection are the same,
just as is required for other virtual accesses.
See address alignment requirements
described in Part 2, “PowerPC Virtual
Environment Architecture” on page 117.

4. It is possible that the hardware implements 
two TLB arrays (one for data and one for 
instruction). In this case the size, shape and 
values contained by the arrays may be
different.

5. Use the tlbie or tlbia instruction to ensure
that the TLB no longer contains a mapping for
a particular page.

6. Refer to Book IV, PowerPC Implementation
Features for the procedure to be used to 
invalidate the entire TLB. 


12.6 Direct-Store Segments 

A direct-store segment is a mapping of effective 
addresses onto an external address space, typically 
an I/O bus. 

Effective addresses that lie within direct-store
segments complete only the first step of the ordinary
segmented address translation.

■ In 64-bit implementations, this is the search of the
Segment Table. If the resulting Segment Table
Entry has T=1, the reference is to a direct-store
segment. 

■ In 32-bit implementations, this is the selection of 
the Segment Register. If the SR has T=1, the 
reference is to a direct-store segment. 

Direct-store data accesses are executed as though 
the storage access mode bits “WIMG” were x101 (see
Section 12.8). This mode requires bypassing the 
cache, does not require the hardware to enforce data 
coherence with storage, I/O, and other processors 
(caches), and treats the segment as Guarded storage. 


12.6.1 Completion of direct-store 
access 

If an access is translated by the Block Address
Translation mechanism (BAT, Section 12.7), the BAT
translation takes precedence and the results of segmented
address translation are not used. If an access is not 
translated by a BAT, and the segmented address 
translation process has discovered that the segment 
has T=1, translation terminates. No reference is 
made to the Page Table; Reference and Change bits 
are not updated. The following data is sent to the 
storage controller: 

For 64-bit implementations:

■ A one bit field representing the privilege of
the storage access, computed as follows:

Key ← (K_p & MSR_PR) | (K_s & ¬MSR_PR)

■ The 32-bit IO field from bits 32:63 of the
second doubleword of the STE

■ The low-order 28 bits of the Effective
Address, EA_36:63

For 32-bit implementations:

■ A one bit field representing the privilege of
the storage access, computed as follows:

Key ← (K_p & MSR_PR) | (K_s & ¬MSR_PR)

■ The contents of bits 3:31 of the Segment
Register, which is the BUID field concatenated
with the “controller specific” field.

■ The low-order 28 bits of the Effective
Address, EA_4:31
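
For the 32-bit case, the three items sent to the storage controller can be sketched as follows. This is an illustration under assumptions, not the architected interface: the function name direct_store_request and the dictionary keys are hypothetical, and the positions of T, K_s, and K_p in Segment Register bits 0, 1, and 2 are assumed rather than taken from this section.

```python
def direct_store_request(sr, ea, msr_pr):
    """Return the fields sent to the storage controller, or None if T = 0.

    Hypothetical helper; msr_pr is 0 (supervisor) or 1 (problem state).
    """
    t = (sr >> 31) & 1    # assumed: SR bit 0
    ks = (sr >> 30) & 1   # assumed: SR bit 1
    kp = (sr >> 29) & 1   # assumed: SR bit 2
    if t == 0:
        return None       # ordinary segment, page translation continues
    # Key <- (K_p & MSR_PR) | (K_s & not MSR_PR)
    key = (kp & msr_pr) | (ks & (msr_pr ^ 1))
    return {
        "key": key,
        "sr_bits_3_31": sr & 0x1FFFFFFF,   # BUID || controller specific field
        "ea_bits_4_31": ea & 0x0FFFFFFF,   # low-order 28 bits of the EA
    }
```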

An implementation of PowerPC Architecture may 
cause multiple address/data transfers for a single 
instruction. The address for each transfer will be 
handled in the same manner that addresses for 
access to main store are handled. 

12.6.2 Direct-store segment 
protection 

Page-level protection as described in 12.10.1, “Page 
Protection” on page 179 is not provided by the 
PowerPC processor for direct-store segments. The 
appropriate key bit (K_s or K_p) from the STE or SR is
sent to the storage controller, but it is up to the
storage controller to implement any protection
mechanism. Frequently no such mechanism will be
provided; the fact that a direct-store segment is mapped
into the address space of a process may be regarded
as sufficient authority to access the segment.

12.6.3 Instructions not supported for
T = 1 

The following instructions are not supported when 
issued with an Effective Address in a segment where 
T = 1: 

• lwarx • stwcx.

• ldarx • stdcx.

• eciwx • ecowx

If one of these instructions is executed with an effective
address in a segment with T=1, a Data Storage
interrupt may occur or the results may be boundedly
undefined.


12.6.4 Instructions with no effect for 
T = 1 

The following instructions are treated as no-ops when 
issued with an Effective Address in a segment where 
T = 1:

• dcbt • dcbst

• dcbtst • dcbz

• dcbf • icbi

• dcbi

For further details of storage references to direct-
store segments, refer to Book IV, PowerPC Implementation
Features.


12.7 Block Address Translation 

The Block Address Translation (BAT) mechanism
provides a means for mapping ranges of virtual
addresses larger than a single page onto contiguous
areas of real storage. Such areas can be used for
data that is not subject to normal virtual storage
handling (paging), such as a memory-mapped display
buffer or an extremely large array of numerical data. 

12.7.1 Recognition of Addresses in 
BAT Areas 

Block Address Translation is enabled only when 
address translation is enabled (MSR_IR = 1 or
MSR_DR = 1 or both).

A set of Special Purpose Registers (SPRs) called BAT 
registers define the starting addresses and sizes of 
BAT areas. The BAT registers are accessed in parallel 
with segmented address translation to determine 
whether a particular EA corresponds to a BAT area. 
If an EA is within a BAT area, the real address for 
storage access is determined as described below. 


It is possible to set up the BAT registers and the
segmented address translation mechanism such that a
particular Effective Address is within a BAT area and
also is covered by page translation. When this
happens, the BAT takes precedence over entries in
the Segment Table or the content of a Segment
Register (including the T bit).

- Programming Note - 

It is possible for a BAT area to overlay part of an 
ordinary segment, such that the BAT portion is 
non-pageable while the rest of the segment is
pageable. If this is done, it is not necessary to 
supply Page Table entries for the portion of the 
segment overlaid by the BAT. 


The BAT areas are defined by pairs of SPRs. These 
SPRs can be read or written by the mfspr and mtspr 
instructions; see page 79. Access to these SPRs is 
privileged. The layout of the BAT registers is shown 
in figure 63 for 64-bit implementations and in figure 64 
for 32-bit implementations. 

Four pairs of BAT registers are provided for
translating instruction addresses (the IBAT registers), and
four pairs are provided for translating data addresses 
(the DBAT registers). 

- Programming Note - 

If the same storage address is to be mapped via 
BAT for both I-fetch and data load and store, it is
necessary to load the mapping into both an IBAT 
pair and a DBAT pair. This is true even on an 
implementation that does not have split I and D 
caches. 


It is an error for system software to set up the BAT 
registers such that an Effective Address is translated 
by more than one IBAT pair or by more than one 
DBAT pair. If this occurs, the results are undefined 
and may include a violation of the storage protection 
mechanism, a Machine Check interrupt, or a 
Checkstop. 

Each pair of BAT registers defines the starting 
address of a BAT area in Effective Address space, the 
length of the area, and the start of the corresponding 
area in Real Address space. If an Effective Address 
is within the range of EAs defined by a pair of BAT 
registers that is valid (see below) for the access, its 
Real Address is developed by (conceptually)
subtracting the starting effective address of the BAT area
from the EA and adding the starting real address of 
the BAT area. 

BAT areas are restricted to a finite set of allowable
lengths, all of which are powers of 2. The smallest
BAT area defined is 128 KB (2^17 bytes). The largest
BAT area defined is 256 MB (2^28 bytes). The starting
address of a BAT area in both EA space and RA
space must be a multiple of the area's length.
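
The length and alignment rules, together with the conceptual subtract-and-add translation described above, can be sketched as follows. The names check_bat_area and bat_real_address are hypothetical, and the sketch works on starting addresses and lengths directly rather than on the encoded BAT register fields.

```python
MIN_BAT_LEN = 1 << 17   # 128 KB
MAX_BAT_LEN = 1 << 28   # 256 MB

def check_bat_area(ea_start, ra_start, length):
    """True if length is an allowed power of 2 and both starts are aligned."""
    is_pow2 = length > 0 and (length & (length - 1)) == 0
    return (is_pow2
            and MIN_BAT_LEN <= length <= MAX_BAT_LEN
            and ea_start % length == 0
            and ra_start % length == 0)

def bat_real_address(ea, ea_start, ra_start, length):
    """Return the Real Address for ea, or None if ea is outside the area."""
    if not check_bat_area(ea_start, ra_start, length):
        raise ValueError("invalid BAT area definition")
    if not ea_start <= ea < ea_start + length:
        return None
    # Conceptually: subtract the starting EA, then add the starting RA.
    return ea - ea_start + ra_start
```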

12.7.2 BAT Registers 

See section “Move To Special Purpose Register 
XFX-form” on page 79 for a list of the SPR numbers 
for the BAT registers. See Appendix C, “Assembler 
Extended Mnemonics” on page 221 for a list of 
extended mnemonics for use with the BAT registers. 
The equation for determining whether a BAT entry is 
valid for a particular access is: 

BAT_entry_valid = (V_s & ¬MSR_PR) | (V_p & MSR_PR)

If a BAT entry is not valid for a given access, it does 
not participate in address translation for that access. 
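
As a small illustration, the validity equation can be written as a truth function. The name bat_entry_valid is hypothetical; the arguments are the V_s and V_p bits of the upper BAT register and the MSR_PR bit, each 0 or 1.

```python
def bat_entry_valid(vs, vp, msr_pr):
    """BAT_entry_valid = (V_s & not MSR_PR) | (V_p & MSR_PR)."""
    return bool((vs & (msr_pr ^ 1)) | (vp & msr_pr))
```

An entry with V_s = 1 and V_p = 0 therefore takes part in translation only in supervisor state (MSR_PR = 0).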


Upper BAT Register:  | BEPI | /// | BL | V_s | V_p |
Lower BAT Register:  | BRPN | /// | WIMG | / | PP |

Reg    Bit    Name  Description
Upper  0:46   BEPI  Block Effective Page Index
       51:61  BL    Block Length
       62     V_s   Supervisor state valid bit
       63     V_p   Problem state valid bit
Lower  0:46   BRPN  Block Real Page Number
       57:60  WIMG  Storage access controls
                    Bit 60 is reserved in IBATs.
       62:63  PP    Protection bits for BAT area

All other fields are reserved.

Figure 63. BAT Registers, 64-bit implementations

Two BAT entries may not map an overlapping effective
address range and be valid at the same time.

- Programming Note -

Entries that have complementary settings of V_s
and V_p may map overlapping effective address
blocks. Complementary settings would be:

BAT entry A: V_s = 1, V_p = 0
BAT entry B: V_s = 0, V_p = 1


The BL field in the upper BAT register is a mask that
encodes the length of the BAT area.

BAT Area Length  BL
128 KB           000 0000 0000
256 KB           000 0000 0001
512 KB           000 0000 0011
1 MB             000 0000 0111
2 MB             000 0000 1111
4 MB             000 0001 1111
8 MB             000 0011 1111
16 MB            000 0111 1111
32 MB            000 1111 1111
64 MB            001 1111 1111
128 MB           011 1111 1111
256 MB           111 1111 1111

Only the values shown are valid for BL. The rightmost
bit of BL is aligned with bit 46 {14} of the EA.

An Effective Address is determined to be within a BAT
area if EA matches BEPI. The boundary between the
string of 0s and the string of 1s in BL determines the
bits of EA that participate in the comparison with
BEPI: bits in EA corresponding to 1s in BL are forced
to 0 for this comparison.

Bits in EA corresponding to 1s in BL, concatenated
with the 17 bits of EA to the right of BL, form the
offset within the BAT area.

- Programming Note - 

The value loaded into BL determines both the
length of the BAT area and the alignment of the
area in both EA space and RA space. It is a
programming error if the value loaded into BL is not
one of those given in the table above, or if the
values loaded into BEPI and BRPN do not have at
least as many low-order 0s as there are 1s in BL.
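
The comparison and offset rules above can be sketched for a 32-bit implementation as follows. This is a hedged illustration, not the architected data flow: the name bat_translate is hypothetical, the register layout is the 32-bit one (BEPI in bits 0:14 and BL in bits 19:29 of the upper register, BRPN in bits 0:14 of the lower register, bit 0 most significant), and forming the Real Address by ORing the offset into BRPN is an assumption that relies on BRPN having 0s wherever BL has 1s. The V_s and V_p valid bits and the protection check are not modeled.

```python
def bat_translate(upper, lower, ea):
    """Return the Real Address if ea lies in the BAT area, else None.

    Hypothetical 32-bit sketch; validity and protection are not checked.
    """
    bepi = (upper >> 17) & 0x7FFF   # upper bits 0:14
    bl = (upper >> 2) & 0x7FF       # upper bits 19:29, aligned with EA bit 14
    brpn = (lower >> 17) & 0x7FFF   # lower bits 0:14
    ea_high = (ea >> 17) & 0x7FFF   # EA bits 0:14
    # EA bits under the 1s of BL are forced to 0 before comparing with BEPI.
    if (ea_high & ~bl & 0x7FFF) != bepi:
        return None
    # Offset = EA bits under the 1s of BL || the 17 bits to the right of BL.
    return ((brpn | (ea_high & bl)) << 17) | (ea & 0x1FFFF)
```

For a 256 KB area (BL = 0b000 0000 0001) starting at EA 0x40040000 and RA 0x00800000, the address 0x40063456 translates to 0x00823456, which agrees with subtracting the starting EA and adding the starting RA.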


Upper BAT Register:  | BEPI | /// | BL | V_s | V_p |
Lower BAT Register:  | BRPN | /// | WIMG | / | PP |

Reg    Bit    Name  Description
Upper  0:14   BEPI  Block Effective Page Index
       19:29  BL    Block Length
       30     V_s   Supervisor state valid bit
       31     V_p   Problem state valid bit
Lower  0:14   BRPN  Block Real Page Number
       25:28  WIMG  Storage access controls
                    Bit 28 is reserved in IBATs.
       30:31  PP    Protection bits for BAT area

All other fields are reserved.

Figure 64. BAT Registers, 32-bit implementations

Figure 65. Formation of Real Address via BAT, 64-bit
implementations

12.7.2.1 BAT Storage Protection 

If an Effective Address is determined to be within a 
BAT area that is valid for the access, the access is 
next validated by the storage protection scheme 
described in section 12.10.2, “BAT Protection” on 
page 180. If this protection mechanism rejects the 
EA, a page fault (Data Storage interrupt or Instruction 
Storage interrupt) is generated. 

12.7.2.2 BAT Real Address 

If the protection mechanism accepts the access, then 
a Real Address is formed as shown in figure 65 for
64-bit implementations, and figure 66 for 32-bit
implementations.

Access to the real memory of the BAT area is made
according to the storage mode defined by the “WIMG”
bits in the lower BAT register. These bits apply to the
entire BAT area rather than to an individual page. 
See 12.8.2, “Supported Storage Modes” on page 177 
for an explanation of these bits. 



Figure 66. Formation of Real Address via BAT, 32-bit 
implementations. 

12.8 Storage Access Modes 

When address relocation is enabled and the effective 
address generated by a storage access is translated 
by the Segmented Address Translation mechanism or 
by the Block Address Translation mechanism, the 
access is performed under the control of the Page 
Table Entry or BAT entry used to translate the effective
address. Each Page Table Entry or DBAT entry
contains four mode control bits, W, I, M, and G, that
specify the storage mode for all accesses translated
by the entry. The IBAT entry contains the W, I, and M
bits, but not the G bit. The W and I bits control how
the processor executing the access uses its own
cache. The M bit specifies whether the processor
executing the access must use the storage coherence
protocol to ensure that all copies of the addressed
storage location are made consistent. The G bit
controls whether or not speculative data and instruction
fetching is permitted. For an access translated by an
IBAT entry, G is assumed to be 0.

The mode control bits only have meaning when an
effective address is translated in the processor
performing a storage access. When an access is
performed for which coherence is required, the processor
performing the access must inform the coherence
mechanism that the access requires memory
coherence. Other processors affected by the access must
respond to the coherence mechanism. However, since
these mode control bits are only relevant when an
effective address is translated and have no direct
relation to data in the cache, processors responding
to the coherence request are able to respond without
knowledge of the state of these bits.

12.8.1 W, I, M and G bits 

The W, I, M, and G bits in a Page Table Entry or DBAT
entry, or the W, I, and M bits in an IBAT entry, control
the way in which the processor accesses cache and 
main storage. Each bit controls a separate aspect of 
storage references. 

W Write Through 

If the data is in the cache, a store must update 
that copy of the data. In addition, if W = 1 the 
update must be written to the home storage 
location (see below). 

Store combining optimizations are allowed
except when the store instructions are separated
by sync or eieio. The architecture presumes
that data present in the cache is valid
and a store may cause any part of that data to
be copied back to main storage.

The definition of the home storage location is 
dependent upon the implementation of the 
memory system but can be illustrated by the 
following examples: 

■ RAM Storage 

The store must be sent to the RAM controller
to be written into the target RAM.

■ I/O Adapter Card

The store must be sent to the adapter card
to be written to the target register or
storage location.

In systems with multilevel caching, the store 
must be written to at least a depth in the 
memory hierarchy that is seen by all 
processors and devices. 

I Caching Inhibited 

If I = 1, the storage access is completed by
referencing the location in main storage,
bypassing the cache. During the access, the
accessed location is not brought into the cache
nor is the location allocated in the cache. It is
considered a programming error if a copy of
the target location of an access to Caching
Inhibited storage is in the cache. Software
must ensure that the location has not previously
been brought into the cache or, if it has,
that it has been flushed from the cache. If the
programming error occurs, the result of the
access is boundedly undefined.

Load/store combining optimizations are
allowed except when the accesses are separated
by sync, or by eieio when the storage
access is also Guarded.


M Memory Coherence 

This mode control is provided to allow
improved performance in systems in which
accesses to storage kept consistent by hardware
are slower than accesses to storage not
kept consistent by hardware, and in which software
is able to enforce the required consistency.
When the mode is off (M = 0), the
hardware need not enforce data coherence.
When the mode is on (M = 1), the hardware
must enforce data coherence. Because
instruction storage need not be consistent with
data storage, it is permissible for an implementation
to ignore the M bit for instruction
fetches.

G Guarded Storage 

If G = 1, accesses to storage must conform to 
the restrictions described in Section 12.2.5, 
“Speculative Execution” on page 157. 

12.8.2 Supported Storage Modes 

The combinations of the Write Through bit, the 
Caching Inhibited bit, and the Memory Coherence bit 
define eight different storage modes. Six of these 
modes are supported. For each, the G bit may be 0 
or 1. 

■ WIM = 000

1. Data may be cached. 

2. Loads or stores for which the target location 
is in the cache may use that copy of the 
location. 

3. Exclusive ownership of the block containing 
the target location is not required for store 
accesses and consistency operations for the 
block may be ignored when fetching the 
block, storing it back, or changing its state 
from shared to exclusive. 

■ WIM = 001 

1. Data may be cached. 

2. Loads or stores for which the target location 
is in the cache may use that copy of the 
location. 

3. Exclusive ownership of the block containing 
the target location is required before store 
accesses are allowed. When fetching the
block, the processor must indicate that consistency
is to be enforced on the bus transaction.
If the state of the block is read
shared, the processor must gain exclusive
use of the block before storing into it.

■ WIM = 010 

Caching is inhibited. The storage access goes to 
storage bypassing the cache. Hardware enforced 
storage consistency is not required. 
■ WIM = 011

Caching is inhibited. The storage access goes to
storage bypassing the cache. Storage consistency
is enforced by hardware.

■ WIM = 100 

1. Data may be cached. 

2. Loads for which the target location is in the 
cache may use that copy of the location. 

3. Stores must be written to main storage. The 
target location of the store may be cached 
and must be updated if there. 

4. Exclusive ownership of the block containing 
the target location is not required for store 
accesses and consistency operations for the 
block may be ignored when fetching the 
block, storing it back, or changing its state 
from shared to exclusive. 

■ WIM = 101 

1. Data may be cached. 

2. Loads for which the target location is in the 
cache may use that copy of the location. 

3. Stores must be written to main storage. The 
target location of the store may be cached 
and must be updated if there. 

4. Exclusive ownership of the block containing 
the target location is required before store 
accesses are allowed. When fetching the
block, the processor must indicate that consistency
is to be enforced on the bus transaction.
If the state of the block is read
shared, the processor must gain exclusive
use of the block before storing into it.

■ WIM = 110 

This mode would represent memory that is Write
Through, Caching Inhibited, and Memory Coherence
Not Required. This mode is not supported.

■ WIM = 111

This mode would represent memory that is Write
Through, Caching Inhibited, and Memory Coherence
Required. This mode is not supported.
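
The eight WIM combinations listed above can be summarized in a short sketch. The names SUPPORTED_WIM and describe_wim, and the dictionary keys, are hypothetical; the sketch only restates the list, treating the two combinations with both W = 1 and I = 1 as unsupported.

```python
# The six supported WIM encodings; 0b110 and 0b111 are not supported.
SUPPORTED_WIM = frozenset({0b000, 0b001, 0b010, 0b011, 0b100, 0b101})

def describe_wim(wim):
    """Return a summary of a 3-bit WIM value, or None if unsupported."""
    if wim not in SUPPORTED_WIM:
        return None
    w, i, m = (wim >> 2) & 1, (wim >> 1) & 1, wim & 1
    return {
        "data_may_be_cached": i == 0,
        "stores_written_to_main_storage": w == 1 or i == 1,
        "hardware_enforced_coherence": m == 1,
    }
```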

12.8.3 Mismatched WIMG Bits 

Accesses to the same storage location using two 
effective addresses for which the Write Through mode 
(W bit) differs must meet the Memory Coherence 
requirements described in Part 2, “PowerPC Virtual 
Environment Architecture” on page 117. 


12.9 Reference and Change 
Recording 

If address translation is enabled (MSR_IR = 1 or
MSR_DR = 1), Reference (R) and Change (C) bits are
maintained in the Page Table Entry for each real page 
for accesses due to segment and page table address 
translation. Reference and change recording is not 
performed for translations due to BAT or for direct- 
store (T= 1) segments. 

The R and C bits are set automatically by hardware or 
by software assist in conjunction with normal Page 
Table processing as follows: 

Reference bit 

As a result of page table processing for a
storage access (load, store, or cache instruction,
or instruction fetch), the Reference bit may
be set to 1 immediately or its setting may be
delayed until the storage access is determined
to be successful.

The Reference bit may be set for a speculatively
executed access. The Reference bit may
also be set for accesses that are not performed
when the access is prohibited by page protection,
or if the access is the result of a string
operation of zero length, or if the access is a
Store Conditional but no store is performed
because a reservation does not exist.

Change Bit 

Whenever a data store is executed successfully, 
as part of the TLB look-up procedure the 
Change bit in the TLB is checked. If it is already 
set to 1, no further action is taken. If the TLB
Change bit is 0, it is set to 1 and the corresponding
Change bit in the Page Table Entry is
set to 1.

The PowerPC Architecture requires that the 
Change bit be set to 1 only if the store is 
allowed by storage protection and all branches 
prior to the store that will cause the Change bit 
to be set have been resolved and it has been 
determined that the store is on the path that is 
to be executed. 

Furthermore, the Change bit may be set even 
when a store is not performed successfully in 
the following cases: 

1. A Store Conditional (stwcx. or stdcx.) is
executed and is allowed by the storage protection
mechanism, but no store is performed
because a reservation does not
exist.


2. A Store String Word Indexed (stswx) is executed
and is allowed by the storage protection
mechanism, but no store is
performed because the length is zero.

3. The store operation is not performed
because the instruction stream is interrupted
before the store is performed.

Execution of either of the Data Cache Block Touch
instructions (dcbt, dcbtst) may result in setting the R
bit for a page. Neither instruction may result in
setting the C bit for a page.

See section 12.12, “Table Update Synchronization 
Requirements” on page 186 for the rules software 
must follow when updating the Reference and Change 
bits in the Page Table. 

12.9.1 Synchronization of Reference 
and Change Bit Updates 

If processor A executes a load or store that causes a 
Reference bit and/or Change bit update, the following 
conditions must be met with respect to setting of the 
bits and performing the access: 

1. If processor A subsequently executes a sync, 
both the updates to the bits and the access must 
be performed with respect to all other processors 
and mechanisms before the sync completes on 
processor A. 

2. If processor B subsequently executes a tlbie that 
invalidates the TLB entry in processor A that was 
used to translate the access, and processor B 
then executes a tlbsync that is broadcast, both 
the updates to the bits and the access must be 
performed with respect to all other processors 
and mechanisms before the tlbsync completes on 
processor A. 

Updates to the Reference and Change bits may not 
be immediately visible to the program after executing 
a load or store that sets them indirectly. 

- Programming Note - 

If it is important that the program that loads from 
the PTE retrieve the correct R and C bits, a sync 
instruction must be executed between a load or 
store that indirectly sets an R or C bit, and the 
load of these bits from the PTE. 


- Programming Note -

On systems with Translation Lookaside Buffers, 
the Reference and Change bits are only set on the 
basis of TLB activity. When software resets these 
bits to zero it must synchronize the TLB's actions 
by invalidating the TLB entries associated with 
the pages whose Reference and Change bits were 
reset. 


12.10 Storage Protection 

The storage protection mechanism provides a means 
for selectively granting read access, granting 
read/write access, and prohibiting access to areas of 
storage based on a number of control criteria. 

Since the protection mechanism operates as part of
the address translation mechanism, storage protection
applies to translated accesses only. Instruction
storage access protection is active only when
MSR_IR = 1. Data storage access protection is active
only when MSR_DR = 1.

A page (4 KB) crossing is relevant to performance
and instruction restart when it corresponds to a
protection boundary. Crossing a 4 KB boundary in an
area mapped by Block Address Translation or in a
direct-store segment should have no effect on
performance and should not cause an instruction restart.

For ordinary translated accesses to memory via the 
Page Table, the Page Protection mechanism described 
in the next section is active. Different mechanisms 
are used for Block Address Translation (BAT) 
accesses (see section 12.10.2, “BAT Protection” on 
page 180) and for Direct-store segments (see section 
12.6.2, “Direct-store segment protection” on
page 173). 

12.10.1 Page Protection 

The page protection mechanism provides protection 
at the granularity of a page (4 KB). It is controlled by 
the following inputs: 

■ MSR_PR, which distinguishes between supervisor
state and problem state.

■ K_s and K_p, supervisor and problem key bits in the
Segment Table Entry or Segment Register.

■ PP bits in the Page Table Entry.

A reference made via the segmented address
translation mechanism is associated with a Segment Table
Entry (STE) and a Page Table Entry (PTE) by the
address translation mechanism. The K bits, the PP
bits, and the MSR_PR bit are used as follows:

A Key value is developed according to the following
formula:

Key ← (K_p & MSR_PR) | (K_s & ¬MSR_PR)

Using the generated Key, the following table is
applied:

Key  PP  Page Type   Load Access  Store Access
                     Permitted    Permitted
0    00  read/write  yes          yes
0    01  read/write  yes          yes
0    10  read/write  yes          yes
0    11  read only   yes          no
1    00  no access   no           no
1    01  read only   yes          no
1    10  read/write  yes          yes
1    11  read only   yes          no

Key  Key selected by state of MSR_PR bit
PP   PTE page protect bits

Figure 67. Protection Key Processing

When a reference is not permitted because of the
protection mechanism one of the following occurs.

■ Data Storage interrupt is generated and bit 4 of
the DSISR is set to 1.

■ Instruction Storage interrupt is generated and bit
36 {4} of SRR1 is set to 1.
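
The Key formula and the table of Figure 67 can be combined into a small lookup, sketched below. The names ACCESS, protection_key, and access_permitted are hypothetical; K_s, K_p, and MSR_PR are passed as 0 or 1 and PP as a 2-bit value.

```python
# (load permitted, store permitted) indexed by (Key, PP), per Figure 67.
ACCESS = {
    (0, 0b00): (True, True),
    (0, 0b01): (True, True),
    (0, 0b10): (True, True),
    (0, 0b11): (True, False),
    (1, 0b00): (False, False),
    (1, 0b01): (True, False),
    (1, 0b10): (True, True),
    (1, 0b11): (True, False),
}

def protection_key(ks, kp, msr_pr):
    """Key <- (K_p & MSR_PR) | (K_s & not MSR_PR)."""
    return (kp & msr_pr) | (ks & (msr_pr ^ 1))

def access_permitted(ks, kp, msr_pr, pp, is_store):
    """True if the load (or store, when is_store) is permitted."""
    load_ok, store_ok = ACCESS[(protection_key(ks, kp, msr_pr), pp)]
    return store_ok if is_store else load_ok
```

For example, with K_s = 0 and K_p = 1, a page with PP = 00 is read/write in supervisor state but inaccessible in problem state.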


- Programming Note -

A store that is not permitted because of the 
storage protection mechanism will not cause a 
Change bit to be set in a PTE; such an access may 
cause a Reference bit to be set in a PTE. 


12.10.2 BAT Protection 

The BAT protection mechanism operates on an entire 
BAT area, not on individual pages. If an Effective 
Address is determined to be within a BAT area that is 
valid for the access, the operations described above 
in section 12.10.1, “Page Protection” on page 179 are 
performed, with these exceptions: 

■ For BATs, no Key value is defined; Figure 67 is 
used with an assumed Key = 1. 

■ The PP bits from the lower BAT register are used, 
not bits from a Page Table Entry. 


12.11 Storage Control 
Instructions 

12.11.1 Cache Management 
Instructions 

This section contains the only privileged cache
management instruction and additional specifications for
the other cache management instructions described in 
Part 2, “PowerPC Virtual Environment Architecture” 
on page 117. See that document for further details. 

If the effective address references a direct-store 
segment, the instruction is treated as a no-op. 

When data relocate is off, MSR_DR = 0, the Data Cache
Block set to Zero instruction establishes a block in 
the cache and may not verify that the real address is 
valid. If a block is created for an invalid real address, 
a Machine Check may result when an attempt is made 
to write that block back to storage. The block could 
be written back as the result of the execution of an 
instruction that causes a cache miss and the invalid 
address block is the target for replacement or as the 
result of a Data Cache Block Store instruction. 


Data Cache Block Invalidate X-form 


dcbi RA,RB 


|  31  |  ///  |  RA  |  RB  |  470  | / |
0      6       11     16     21      31


Let the effective address (EA) be the sum 
(RA|0) + (RB). 

The action taken is dependent on the storage mode 
associated with the target, and the state of the block. 
The list below describes the action to take if the block 
containing the byte addressed by EA is or is not in the 
cache. 

1. Coherence Not Required 
Unmodified Block 

Invalidate the block in the local cache. 

Modified Block 

Invalidate the block in the local cache. (Discard 
the modified contents.) 

Absent Block 

No action is taken. 

2. Coherence Required 
Unmodified Block 

Invalidate copies of the block in the caches of 
all processors. 

Modified Block 

Invalidate copies of the block in the caches of
all processors. (Discard the modified contents.)

Absent Block 

If copies are in the caches of any other 
processor, cause the copies to be invalidated. 
(Discard any modified contents.) 
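
The action list above can be read as a lookup on two inputs: whether coherence is required, and the state of the block in the local cache. The following is only an illustrative sketch of that table; the function name, the state strings, and the action strings are hypothetical and are not part of the architecture.

```python
# Illustrative model of the dcbi action list; all names are hypothetical.
LOCAL = "invalidate the block in the local cache"
ALL = "invalidate copies of the block in the caches of all processors"
OTHERS = "invalidate any copies in the caches of other processors"
NONE = "no action"

DCBI_ACTION = {
    # (coherence required, local block state): action
    (False, "unmodified"): LOCAL,
    (False, "modified"): LOCAL,    # modified contents are discarded
    (False, "absent"): NONE,
    (True, "unmodified"): ALL,
    (True, "modified"): ALL,       # modified contents are discarded
    (True, "absent"): OTHERS,      # any modified contents are discarded
}


def dcbi_action(coherence_required, block_state):
    """Return the action listed for the given storage mode and block state."""
    return DCBI_ACTION[(coherence_required, block_state)]
```

Note that in no case is a modified block written back to storage; the contents are discarded.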

When data address translation is enabled (MSR DR = 1)
and the virtual address has no translation, a Data
Storage interrupt occurs. See 13.5.3, “Data Storage
Interrupt” on page 194.

The function of this instruction is independent of the 
Write Through and Caching Inhibited/Allowed modes 
of the block containing the byte addressed by EA. 

This instruction is treated as a store to the addressed
byte with respect to address translation and
protection. The Reference bit for EA may be set, the
Reference and Change bits may be set, or neither may
be set.

This instruction is privileged. 


Special Registers Altered: 
None


12.11.2 Segment Register Manipulation Instructions


Move To Segment Register X-form 

mtsr SR,RS 


31      RS      /       SR      ///     210       /
0       6       11      12      16      21        31


SEGREG(SR) <- (RS)

The contents of register RS are placed into Segment
Register SR.

This instruction is privileged.

This instruction is defined only for 32-bit
implementations. Using it on a 64-bit implementation will cause
an Illegal Instruction type Program interrupt.

Special Registers Altered: 

None 
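
As an illustration of the instruction layout above, the sketch below assembles the 32-bit mtsr instruction word from its fields. It assumes the architecture's bit numbering, in which bit 0 is the most significant bit, so a field ending at bit b is shifted left by 31 - b. The function name is hypothetical.

```python
def encode_mtsr(sr, rs):
    """Assemble the 32-bit mtsr instruction word from its X-form fields."""
    if not 0 <= sr <= 15 or not 0 <= rs <= 31:
        raise ValueError("SR must be 0..15 and RS must be 0..31")
    opcode = 31 << 26      # primary opcode, bits 0:5
    rs_field = rs << 21    # RS, bits 6:10
    sr_field = sr << 16    # SR, bits 12:15 (bit 11 is reserved)
    xo = 210 << 1          # extended opcode, bits 21:30 (bit 31 is reserved)
    return opcode | rs_field | sr_field | xo
```

For example, `hex(encode_mtsr(3, 5))` gives the word for moving register 5 into Segment Register 3. Reserved fields are left as zeros.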


Move From Segment Register X-form 

mfsr RT,SR 


31      RT      /       SR      ///     595       /
0       6       11      12      16      21        31


RT <- SEGREG(SR) 

The contents of Segment Register SR are placed into
register RT.

This instruction is privileged.

This instruction is defined only for 32-bit
implementations. Using it on a 64-bit implementation will cause
an Illegal Instruction type Program interrupt.

Special Registers Altered: 

None 


— Programming Note - 

For a discussion of software synchronization 
requirements when altering Segment Registers, 
please refer to Appendix L, “Synchronization 
Requirements for Special Registers” on page 269. 


Move To Segment Register Indirect X-form


mtsrin RS,RB
[Power mnemonic: mtsri]


31      RS      ///     RB      242       /
0       6       11      16      21        31


SEGREG((RB) 0:3) <- (RS)

The contents of register RS are copied to the 
Segment Register selected by bits 0:3 of register RB. 

This instruction is privileged. 

This instruction is defined only for 32-bit
implementations. Using it on a 64-bit implementation will cause
an Illegal Instruction type Program interrupt.

Special Registers Altered: 

None 

Move From Segment Register Indirect X-form


mfsrin RT,RB


31      RT      ///     RB      659       /
0       6       11      16      21        31


RT <- SEGREG((RB) 0:3)

The contents of the Segment Register selected by bits 
0:3 of register RB are copied into register RT. 

This instruction is privileged. 

This instruction is defined only for 32-bit
implementations. Using it on a 64-bit implementation will cause
an Illegal Instruction type Program interrupt.

Special Registers Altered: 

None 
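
The indirect forms select the Segment Register with bits 0:3 of RB. Because bit 0 is the most significant bit of the 32-bit register, these are the four high-order bits. A minimal sketch, with a hypothetical function name:

```python
def segment_register_index(rb):
    """Return the Segment Register number selected by bits 0:3 of a 32-bit RB."""
    return (rb >> 28) & 0xF
```

If RB holds an effective address, this selects the Segment Register for the segment containing that address; the low-order 28 bits of RB do not affect the selection.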

— Programming Note -

The RA field is not defined for the mtsrin and
mfsrin instructions in this architecture. However,
mtsrin and mfsrin will perform the same function
in PowerPC as do mtsri and mfsri in Power if RA
is 0 in the Power instructions.


12.11.3 Lookaside Buffer Management Instructions (Optional)

While the PowerPC Architecture describes logically 
separate instruction fetch and fixed-point (including 
effective address computation) execution units, the 
programming model is that there is one translation 
mechanism and, for 32-bit implementations, one set of 
segment registers. 

For performance reasons, most implementations will
implement a Segment Lookaside Buffer (64-bit
implementations) and a Translation Lookaside Buffer.
These are caches of portions of the Segment Table
and Page Table respectively. As changes are made
to the address translation tables, it is necessary to
force the SLB and TLB into line with the updated
tables. This is done by invalidating SLB and TLB
entries, or occasionally by invalidating the entire SLB
or TLB, and allowing the translation caching
mechanism to re-fetch from the tables.

Each PowerPC implementation which has an SLB must 
provide means for doing the following: 

■ Invalidating an individual SLB entry 

■ Invalidating the entire SLB 

Each PowerPC implementation which has a TLB must 
provide means for doing the following: 

■ Invalidating an individual TLB entry 

■ Invalidating the entire TLB 

An implementation may choose to provide one or
more of the instructions listed in this section in order
to satisfy requirements in the preceding list. If an
instruction is implemented that matches the semantics
of an instruction described here, the implementation
should be as specified here. Alternatively, an
algorithm may be given that performs one of the
functions listed above (a loop invalidating individual SLB
entries may be used to invalidate the entire SLB, for
example). Or instructions with different semantics may
be implemented. Such algorithms or instructions
must be described in Book IV, PowerPC Implementation
Features.

It is permissible for an instruction described here to
be implemented so that more is done than absolutely
required. For example, an instruction whose semantics
are to purge an SLB entry may be implemented
so as to purge an entire congruence class or perhaps
even the entire SLB. Such additional actions should
be described in Book IV.

If a 64-bit implementation does not implement an
SLB, it does not provide the optional instructions that
affect the SLB (slbie and slbia). In such an
implementation, it is permissible to treat these SLB instructions
as no-ops. Similarly, if the implementation does not
implement a TLB, it does not provide the optional
instructions that affect the TLB (tlbie, tlbia, and
tlbsync). In such an implementation, it is permissible
to treat these TLB instructions as no-ops.
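
Since an implementation may supply an instruction, an algorithm, or nothing at all for each of these functions, system software can hide the difference behind a small interface. The sketch below is one hypothetical way to do so: the class, its parameters, and the callables passed in are all assumptions made for illustration, and the callables stand in for whatever the implementation actually provides.

```python
class LookasideOps:
    """Hypothetical wrapper that hides model-dependent SLB invalidation."""

    def __init__(self, invalidate_entry=None, invalidate_all=None, entries=()):
        self._entry = invalidate_entry   # stand-in for slbie, or None
        self._all = invalidate_all       # stand-in for slbia, or None
        self._entries = list(entries)    # addresses to walk in the fallback loop

    def invalidate_entry(self, ea):
        # With no per-entry primitive (for example, no SLB), this is a no-op.
        if self._entry is not None:
            self._entry(ea)

    def invalidate_all(self):
        if self._all is not None:
            self._all()
        else:
            # Fallback algorithm: invalidate the entries one at a time.
            for ea in self._entries:
                self.invalidate_entry(ea)
```

Callers use only `invalidate_entry` and `invalidate_all`, so moving to an implementation with different primitives changes the constructor arguments and nothing else.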


— Programming Note -

Because the presence, absence, and exact
semantics of the various Lookaside Buffer
management instructions are model dependent, it is
recommended that system software
“encapsulate” uses of such instructions into
subroutines to minimize the impact of moving from
one implementation to another.


SLB Invalidate Entry X-form


slbie RB


31      ///     ///     RB      434       /
0       6       11      16      21        31


EA <- (RB)

if SLB entry exists for EA, then
    SLB entry <- invalid

Let the effective address (EA) be the contents of
register RB. If the Segment Lookaside Buffer (SLB)
contains an entry corresponding to EA, that entry is made
invalid (i.e., removed from the SLB).

The SLB search is done regardless of the settings of
MSR IR and MSR DR.

Block Address Translation for EA, if any, is ignored.

This instruction is privileged.

This instruction is optional in PowerPC Architecture.

This instruction is defined only for 64-bit
implementations. Using it on a 32-bit implementation will cause
an Illegal Instruction type Program interrupt.

Special Registers Altered:

None

— Programming Note -

It is not necessary that the ASR point to a valid
Segment Table when issuing slbie.


SLB Invalidate All X-form


slbia


31      ///     ///     ///     498       /
0       6       11      16      21        31


All SLB entries <- invalid

The entire SLB is made invalid (i.e., all entries are 
removed). 

The SLB is invalidated regardless of the settings of
MSR IR and MSR DR.

This instruction is privileged.

This instruction is optional in PowerPC Architecture.

This instruction is defined only for 64-bit
implementations. Using it on a 32-bit implementation will cause
an Illegal Instruction type Program interrupt.

Special Registers Altered: 

None 

— Programming Note -

It is not necessary that the ASR point to a valid
Segment Table when issuing slbia.


TLB Invalidate Entry X-form


tlbie RB

[Power mnemonic: tlbi]


31      ///     ///     RB      306       /
0       6       11      16      21        31


VPI <- (RB) 36:51 {4:19}

Identify TLB entries corresponding to VPI
Each such TLB entry <- invalid

Let the effective address (EA) be the contents of
register RB. If the Translation Lookaside Buffer (TLB)
contains an entry corresponding to EA, that entry is
made invalid (i.e., removed from the TLB).

The TLB search is done regardless of the settings of
MSR IR and MSR DR. The search is done based on a
portion of the Virtual Page Index, including the least
significant bits, without reference to the SLB, segment
table, or segment register. All entries matching the
search criteria are invalidated.
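
The bits that tlbie takes from RB, 36:51 {4:19}, are the 16 bits immediately above the low-order 12 bits of the register, in both the 64-bit and the 32-bit numbering. A minimal sketch of the extraction, with a hypothetical function name:

```python
def tlbie_vpi_bits(rb):
    """Extract RB bits 36:51 {4:19}: 16 bits starting 12 bits above the LSB."""
    return (rb >> 12) & 0xFFFF
```

Two values of RB that differ only in their high-order bits yield the same result, which is consistent with the statement that all matching TLB entries are invalidated without reference to the segment.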

Block Address Translation for EA, if any, is ignored. 

This instruction is privileged. 

This instruction is optional in PowerPC Architecture. 

See Section 12.12, “Table Update Synchronization
Requirements” on page 186 for a description of other
requirements associated with the use of this
instruction.

Special Registers Altered: 

None 

- Programming Notes - 

Nothing is guaranteed about instruction fetching in
other processors if tlbie deletes the TLB entry for
the page in which some other processor is
currently executing.


TLB Invalidate All X-form


tlbia


31      ///     ///     ///     370       /
0       6       11      16      21        31


All TLB entries <- invalid

The entire TLB is invalidated (i.e., all entries are 
removed). 

The TLB is invalidated regardless of the settings of
MSR IR and MSR DR.

This instruction is privileged. 

This instruction is optional in PowerPC Architecture. 

Special Registers Altered: 

None 


— Programming Notes - 

It is not necessary that the ASR point to a valid
Segment Table or that SDR1 point to a valid
page table when issuing tlbia.

Nothing is guaranteed about instruction fetching in
other processors if tlbie deletes the TLB entry for
the page in which some other processor is
currently executing.


TLB Synchronize X-form


tlbsync


The tlbsync instruction does not complete until
all previous tlbie and tlbia instructions executed by
the processor executing this instruction have been
received and completed by all other processors.

This instruction is privileged. 

This instruction is optional in PowerPC Architecture, 
but it must be implemented if any of the following are 
true: 

■ A TLB invalidation instruction that broadcasts is 
implemented. 

■ The eciwx or ecowx instructions are
implemented.

See Section 12.12, “Table Update Synchronization
Requirements” for a description of other
requirements associated with the use of this instruction.

Special Registers Altered: 

None 


12.12 Table Update 
Synchronization Requirements 

This section describes the steps that software must 
take when updating the tables involved in address 
translation. Updates to these tables include: 

■ Adding a new Page Table Entry (PTE). 

■ Modifying an existing PTE, including the special 
case of modifying the PTE's Reference bit. 

■ Deleting a PTE. 

■ Adding a new Segment Table Entry (STE). 

■ Modifying an existing STE. 

■ Deleting a STE. 

In a multiprocessor system it is critical that these
rules be followed to ensure that all processors see a
consistent set of tables. Even in a uniprocessor
system certain rules must be followed, notably those
regarding Reference and Change bit updates, because
software changes must be synchronized with
automatic updates by the hardware.

A sync instruction ensures that all prior tlbie
instructions executed by the processor executing the
sync instruction have completed on that processor.

To ensure that a tlbie instruction executed by one
processor has completed on all other processors, the
sequence tlbie followed by sync is not sufficient. This
sequence must be followed by a tlbsync instruction
and then a sync instruction on the processor that
executed the tlbie to ensure that

1. the prior tlbie instructions have completed on
other processors, and

2. the tlbsync has completed on the processor
executing this sequence.

When tlbie is executed on one processor, software 
must ensure that the following sequence of 
instructions is executed on that processor before a 
tlbie is executed on a second processor. 

1. sync

2. tlbsync 

3. sync 

Other instructions may be interleaved with this 
sequence of instructions but these instructions must 
appear in the sequence in the order shown. 
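
The ordering rule above can be checked mechanically on a recorded instruction trace. The sketch below is a hypothetical checker, not part of the architecture: it takes a list of mnemonics for one processor and reports whether the last tlbie in it is followed by sync, tlbsync, and sync in that order, with other instructions allowed in between.

```python
REQUIRED = ("sync", "tlbsync", "sync")


def tlbie_sequence_complete(trace):
    """True if the last tlbie in `trace` is followed, in order, by REQUIRED."""
    if "tlbie" not in trace:
        return True
    last = len(trace) - 1 - trace[::-1].index("tlbie")
    rest = iter(trace[last + 1:])
    # Each `any` consumes the iterator up to the first match, so the
    # matches must occur in order; unrelated instructions are skipped.
    return all(any(op == want for op in rest) for want in REQUIRED)
```

A trace such as `["tlbie", "sync", "lwz", "tlbsync", "sync"]` passes, while `["tlbie", "sync", "tlbsync"]` does not, because the final sync is missing.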

12.12.1 Page Table Updates

HTAB entries must be locked on multiprocessors.
Access to HTAB entries must be appropriately
synchronized by software locking of (i.e., guaranteeing
exclusive access to) entries or groups of entries if
more than one processor can modify the table at
once.

On uniprocessors, HTAB entries need not be locked.
To adapt the examples given below for the
uniprocessor case, simply delete the “lock()” and
“unlock()” lines. The sync instructions shown are still
required even on uniprocessors.

TLBs are non-coherent caches of the HTAB. TLB 
entries must be flushed explicitly with one of the TLB 
invalidate instructions. The sync instruction waits 
until all prior TLB invalidates by this processor are 
complete. This may cost a sync per HTAB entry 
update. 

Unsynchronized lookups in the HTAB continue even
while it is being modified. Any processor, even
including the processor modifying the HTAB, may look
in the HTAB at any time in an attempt to reload a TLB
entry. An inconsistent HTAB entry must never
accidentally become visible, thus there must be
synchronization between modifications to the valid bit and
any other modifications. This costs as many as two
syncs per HTAB entry update.

Processors write Reference and Change bits with
unsynchronized atomic byte stores. This requires that
the V, R, and C bits be in distinct bytes. It also
requires extreme care to ensure that no store
overwrites one of these bytes accidentally.

In the examples below, 

■ “lock()” and “unlock()” refer to software locks for
exclusive access to the table entry in question,

■ sync refers to the sync instruction, 

■ tlbsync refers to the tlbsync instruction, and 

■ tlbie refers to the tlbie instruction. 

12.12.1.1 Adding a Page Table Entry 

This is the simplest Page Table case. It requires no 
synchronization with the hardware, just a lock on the 
PTE in a multiprocessor system. We fill in the entries 
in the PTE except for the Valid bit, issue a sync to 
ensure that the updates have all made it to storage, 
and turn on the Valid bit. 

    lock(PTE)
    PTE VSID,H,API <- new values
    PTE RPN,R,C,WIM,PP <- new values
    sync
    PTE V <- 1
    unlock(PTE)
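
The point of this ordering is that no observer can find the entry valid while its other fields are incomplete. The sketch below models that property in software only: the dictionary standing in for a PTE, the field names as keys, and the log of snapshots are all hypothetical, and the string "sync" merely marks where the sync instruction would be issued. Locking is omitted.

```python
def add_pte(pte, new_fields, log):
    """Fill every field first, then set V last, recording each visible state."""
    for name, value in new_fields.items():   # e.g. VSID, H, API, RPN, WIM, PP
        pte[name] = value
        log.append(dict(pte))                # what a table lookup could observe
    log.append("sync")                       # stands in for the sync instruction
    pte["V"] = 1
    log.append(dict(pte))
```

Every snapshot recorded before the final one has V = 0, so a lookup that races with the update ignores the entry instead of using partial contents.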


12.12.1.2 Modifying a Page Table Entry 
General case 

In this case a currently-valid PTE must be changed.
To do this we must lock the PTE, mark it invalid, flush
it from the TLB, update the information in the PTE,
mark it valid again, and unlock, using sync at
appropriate times to wait for modifications to complete.

    lock(PTE)
    PTE V <- 0
    sync
    tlbie(PTE)
    sync
    tlbsync
    sync
    PTE VSID,H,API <- new values
    PTE RPN,R,C,WIM,PP <- new values
    sync
    PTE V <- 1
    unlock(PTE)
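
As noted earlier in this section, the uniprocessor form of such a sequence is obtained by deleting the “lock()” and “unlock()” lines while keeping every sync. The sketch below expresses that adaptation for the general case; the list of step strings and the function name are hypothetical conveniences, not an interface defined by the architecture.

```python
MODIFY_PTE_STEPS = [
    "lock(PTE)",
    "PTE V <- 0",
    "sync",
    "tlbie(PTE)",
    "sync",
    "tlbsync",
    "sync",
    "PTE VSID,H,API <- new values",
    "PTE RPN,R,C,WIM,PP <- new values",
    "sync",
    "PTE V <- 1",
    "unlock(PTE)",
]


def modify_pte_steps(multiprocessor=True):
    """Return the general-case sequence, without locking on a uniprocessor."""
    if multiprocessor:
        return list(MODIFY_PTE_STEPS)
    return [s for s in MODIFY_PTE_STEPS
            if not s.startswith(("lock(", "unlock("))]
```

The uniprocessor list still contains all four sync steps, because software updates must remain ordered with respect to table lookups by the hardware.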

Resetting the Reference bit 

In the case where the PTE is modified only to set the 
Reference bit to 0, a much simpler algorithm suffices 
because the Reference bit need not be maintained 
exactly. 

    lock(PTE)
    oldR <- PTE R
    if oldR = 1 then
        PTE R <- 0
        tlbie(PTE)
    unlock(PTE)

Since only the R and C bits are modified by hardware,
and since R and C are in different bytes, the R bit can
be set to 0 by reading the current contents of the byte
in the PTE containing R (bits 48:55 of the second
doubleword on 64-bit implementations, bits 16:23 of
the second word on 32-bit implementations), ANDing
the value with 0xFE, and storing the byte back into
the PTE.
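
The byte operation described above can be sketched as follows for the 32-bit case. The example assumes the second word of the PTE is held as four bytes in big-endian order, so that bits 16:23 are the byte at index 2 and R is the least significant bit of that byte; the function name is hypothetical.

```python
def clear_r_32bit(second_word):
    """Clear R in place; second_word is a 4-byte big-endian bytearray."""
    second_word[2] &= 0xFE   # byte holding bits 16:23; R is its low-order bit
```

Only the one byte is read and written. The neighbouring byte, which holds the Change bit, is never stored to, so a concurrent hardware update of C cannot be overwritten.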

Modifying the virtual address 

If the virtual address is being changed to a different 
address within the same TLB hash class and cache 
hash class, it suffices to: 

    lock(PTE)
    val <- PTE VSID,API,H,V
    insert new VSID into val
    PTE VSID,API,H,V <- val
    sync
    tlbie(PTE)
    sync
    tlbsync
    sync
    unlock(PTE)

Here we take advantage of the fact that the store into
the first doubleword of the PTE (word, on 32-bit
systems) is performed atomically.

Note that if the new address is not a cache synonym 
of the old, it will be necessary to flush or invalidate 
the page in the cache(s) as well. This may involve 
assigning a temporary virtual address that is such a 
synonym, and using that address to do the cache 
operations. 

12.12.1.3 Deleting a Page Table Entry 

Here we just lock the entry, mark it invalid, wait for 
the change to complete, and unlock. 

    lock(PTE)
    PTE V <- 0
    sync
    tlbie(PTE)
    sync
    tlbsync
    sync
    unlock(PTE)

12.12.2 Segment Table Updates 

These updates are similar to Page Table updates, but
without the complication of hardware updates to
Reference and Change bits.

STAB entries must be locked on multiprocessors.
Access to STAB entries must be appropriately
synchronized by software locking of (i.e., guaranteeing
exclusive access to) entries or groups of entries if
more than one processor can modify the table at
once.

On uniprocessors, STAB entries need not be locked. 
To adapt the examples given below for the 
uniprocessor case, simply delete the “lock()” and 
“unlock()” lines. The sync instructions shown are still 
required even on uniprocessors. 

SLBs are non-coherent caches of the STAB. SLB 
entries must be flushed explicitly with one of the SLB 
invalidate instructions. The sync instruction waits 
until all prior SLB invalidates by this processor are 
complete. This may cost a sync per STAB entry 
update. 

Unsynchronized lookups in the STAB continue even
while it is being modified. Any processor, even
including the processor modifying the STAB, may look
in the STAB at any time in an attempt to reload an SLB
entry. An inconsistent STAB entry must never
accidentally become visible, thus there must be
synchronization between modifications to the valid bit and
any other modifications. This costs as many as two
syncs per STAB entry update.

In the examples below, 

■ “lock()” and “unlock()” refer to software locks for
exclusive access to the table entry in question,

■ sync refers to the sync instruction, and 

■ slbie refers to the slbie instruction. 

12.12.2.1 Adding a Segment Table Entry 

We fill in the entries in the STE except for the Valid 
bit, issue a sync to ensure that the updates have all 
made it to storage, and turn on the Valid bit. 

    lock(STE)
    STE ESID,T,Ks,Kp <- new values
    if T = 0
        then STE VSID <- new value
        else STE IO <- new value
    sync
    STE V <- 1
    unlock(STE)

12.12.2.2 Modifying a Segment Table Entry

In this case a currently-valid STE must be changed.
To do this we must lock the STE, mark it invalid, flush
it from the SLB, update the information in the STE,
mark it valid again, and unlock, using sync at
appropriate times to wait for modifications to complete.

    lock(STE)
    STE V <- 0
    sync
    slbie(STE)
    sync
    STE ESID,T,Ks,Kp <- new values
    if T = 0
        then STE VSID <- new value
        else STE IO <- new value
    sync
    STE V <- 1
    unlock(STE)

12.12.2.3 Deleting a Segment Table Entry

Here we just lock the entry, mark it invalid, wait for
the change to complete, and unlock.

    lock(STE)
    STE V <- 0
    sync
    slbie(STE)
    sync
    unlock(STE)


12.12.3 Segment Register Updates


On an implementation that provides Segment
Registers rather than a Segment Table, there is no table to
be locked but there are certain synchronization
requirements that must be satisfied when using the
Move to Segment Register instructions. See
Appendix L, “Synchronization Requirements for
Special Registers” on page 269.


Chapter 13. Interrupts


13.1 Overview 

The PowerPC architecture provides an interrupt
mechanism to allow the processor to change state as a
result of external signals, errors, or unusual
conditions arising in the execution of instructions.

System Reset and Machine Check interrupts are not
ordered. All other interrupts are ordered such that
only one interrupt is reported, and when it is
processed (taken), no program state is lost. Since
save/restore registers SRR0 and SRR1 are serially
reusable resources used by most interrupts, program
state will be lost when an unordered interrupt is
taken.


13.2 Interrupt Synchronization 

When an interrupt occurs, SRR0 is set to point to an
instruction such that all preceding instructions have
completed execution, no subsequent instruction has
begun execution, and the instruction addressed by
SRR0 may or may not have completed execution,
depending on the interrupt type.

All interrupts are context synchronizing, as defined in
Section 9.7.1, “Context Synchronization” on page 145,
except that System Reset and Machine Check
interrupts need not be context synchronizing if they are
not recoverable (i.e., if bit 62 {30} of SRR1 is set to 0
by the interrupt).


13.3 Interrupt Classes 

Interrupts are classified by whether they are directly 
caused by the execution of an instruction or are 
caused by some other system exception. Those that 
are “system-caused” are: 


■ System Reset 

■ Machine Check 

■ External 

■ Decrementer 

External and Decrementer are maskable interrupts. 
While MSR EE = 0, the interrupt mechanism ignores the 
exceptions that generate these interrupts. Therefore, 
software may delay the generation of these interrupts 
by setting MSR EE = 0 or by failing to set MSR EE = 1
after processing an interrupt. When any interrupt is 
taken, MSR EE is set to 0 by the interrupt mechanism, 
delaying the recognition of any further exceptions 
causing these interrupts. 

System Reset and Machine Check exceptions are not 
maskable. These exceptions will be recognized 
regardless of the setting of the MSR. 
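
The masking rule for system-caused interrupts can be sketched as a simple predicate. The mask value below assumes that EE is MSR bit 48 {16}, which is not stated in this section; the names are hypothetical, and the function covers only the four system-caused interrupt types listed above.

```python
MSR_EE = 0x8000   # assumed position of EE: MSR bit 48 {16}

MASKABLE = {"External", "Decrementer"}


def is_recognized(kind, msr):
    """True if a pending system-caused exception of this kind is recognized."""
    if kind in MASKABLE:
        return bool(msr & MSR_EE)
    return True   # System Reset and Machine Check do not depend on the MSR
```

Because taking any interrupt sets MSR EE to 0, a handler that never sets EE back to 1 keeps External and Decrementer exceptions pending.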

“Instruction-caused” interrupts are further divided
into two classes, precise and imprecise. 

13.3.1 Precise Interrupt 

Except for the Imprecise Mode Floating-Point Enabled
Exception interrupt, all instruction-caused interrupts
are precise. When the execution of an instruction
causes a precise interrupt, the following conditions
exist at the interrupt point:

1. SRR0 addresses either the instruction causing the
exception or the immediately following
instruction. Which instruction is addressed can be
determined from the interrupt type and status
bits.

2. An interrupt is generated such that all
instructions preceding the instruction causing the
exception appear to have completed with respect
to the executing processor. However, some
storage accesses generated by these preceding
instructions may not have been performed with
respect to all other processors and mechanisms.

3. The instruction causing the exception may not
have begun execution, may have partially
completed, or may have completed, depending on the
interrupt type.

4. Architecturally, no subsequent instruction has
begun execution.

13.3.2 Imprecise Interrupt 

This architecture defines one imprecise interrupt: 

■ Imprecise Mode Floating-Point Enabled Exception 

When the execution of an instruction causes an
imprecise interrupt, the following conditions exist at the
interrupt point:

1. SRR0 addresses either the instruction causing the
exception or some instruction following the
instruction causing the exception that generated
the interrupt.

2. An interrupt is generated such that all
instructions preceding the instruction addressed
by SRR0 appear to have completed with respect
to the executing processor.

3. If the imprecise interrupt is forced, by the context
synchronizing mechanism, due to an instruction
that causes another interrupt (e.g., Alignment,
DSI) then SRR0 addresses the interrupt-forcing
instruction, and the interrupt-forcing instruction
may have been partially executed (see section
13.6, “Partially Executed Instructions” on
page 199).

4. If the imprecise interrupt is forced, by the
execution synchronizing mechanism, due to
executing an execution synchronizing instruction
other than sync or isync, then SRR0 addresses
the interrupt-forcing instruction, and the
interrupt-forcing instruction appears not to have begun
execution (except for its forcing the imprecise
interrupt). If the imprecise interrupt is forced by
a sync or isync instruction, then SRR0 may
address either the sync or isync instruction, or
the following instruction.

5. If the imprecise interrupt is not forced by either
the context or the execution synchronizing
mechanism, then the instruction addressed by SRR0
appears not to have begun execution, if it is not
the excepting instruction.

6. No instruction following the instruction addressed
by SRR0 appears to have begun execution.

All Floating-Point Enabled Exception interrupts are
maskable using the MSR bits FE0 and FE1. Although
these interrupts are maskable, they differ significantly 
from the other maskable interrupts in that the 
masking of these interrupts is usually controlled by 
the application program whereas the masking of 
External and Decrementer interrupts is controlled by 
the operating system. 


13.4 Interrupt Processing 

Associated with each kind of interrupt is an interrupt 
vector, which contains the initial sequence of 
instructions that is executed when the corresponding 
interrupt occurs. 

Interrupt processing consists of saving a small part of
the processor's state in certain registers, identifying
the cause of the interrupt in another register, and
continuing execution at the corresponding interrupt
vector location. When an exception exists that will
cause an interrupt to be generated and it has been
determined that the interrupt can be taken, the
following actions are performed:

1. SRR0 is loaded with an instruction address that
depends on the type of interrupt; see the specific
interrupt description for details.

2. Bits 33:36 and 42:47 {1:4 and 10:15} of SRR1 are
loaded with information specific to the interrupt
type.

3. Bits 0:32, 37:41, and 48:63 {0, 5:9, and 16:31} of
SRR1 are loaded with a copy of the corresponding
bits of the MSR, except for the Machine
Check interrupt, for which these bits are set to
implementation-dependent values.

4. The MSR is set as described in Figure 68 on
page 193. The new values take effect beginning
with the first instruction following the interrupt.
MSR bits of particular interest are:

■ MSR IR and MSR DR are set to 0 for all
interrupt types. Thus relocate is turned off for
both instruction fetch and data access beginning
with the first instruction following the
acceptance of the interrupt. See Chapter 12,
“Storage Control” on page 155.

■ The MSR SF bit is set to 1 in 64-bit
implementations and execution after the interrupt begins
in 64-bit mode. This bit is reserved (not
defined) in 32-bit implementations.

5. Instruction fetch and execution resume, using
the new MSR value, at a location specific to the
interrupt type. The location is determined by
adding the interrupt vector's offset (see
Figure 69 on page 193) to the base address
determined by MSR IP (see Interrupt Prefix on
page 149). For a Machine Check that occurs
when MSR ME = 0, the Checkstop state is entered
(the machine stops executing instructions). See
13.5.2, “Machine Check Interrupt” on page 194.
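
Steps 2 and 3 above split SRR1 into interrupt-specific bits and bits copied from the MSR. The sketch below shows that split for a 32-bit implementation, using the bit numbers in braces; bit 0 is the most significant bit, so bit n corresponds to the value 1 << (31 - n). The names are hypothetical, and the Machine Check case, where the copied bits are implementation dependent, is not modeled.

```python
# SRR1 bits {1:4} and {10:15}: loaded with interrupt-specific information.
INFO_MASK = 0x783F0000


def build_srr1(msr, info):
    """Combine MSR bits {0, 5:9, 16:31} with interrupt-specific bits."""
    copied = msr & ~INFO_MASK & 0xFFFFFFFF
    return copied | (info & INFO_MASK)
```

For instance, an interrupt that reports its cause in SRR1 bit {4} would pass `info = 0x08000000`, and every other bit outside the mask would come from the MSR value at the time of the interrupt.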

Interrupts do not clear reservations obtained with
lwarx or ldarx. The operating system should do so at
appropriate points, such as at process switch.


— Programming Note -

In some implementations, any instruction fetch
with MSR IR = 1, and any load or store with
MSR DR = 1, may have the side effect of modifying
SRR0 and SRR1.


— Programming Note - 

In general, at process switch, due to possible
process interlocks and possible data availability
requirements, the operating system needs to
consider executing the following:

■ stwcx., to clear the reservation if one is
outstanding, to ensure that a lwarx or ldarx in
the “old” process is not paired with a stwcx.
or stdcx. in the “new” process.

■ sync, to ensure that all storage operations of
an interrupted process are complete with
respect to other processors before that
process begins executing on another
processor.

■ isync or rfi, to ensure that the instructions in
the “new” process execute in the “new”
context.


13.5 Interrupt Definitions 

Figure 68 below shows all the types of interrupts and 
the values assigned to the MSR for each. Figure 69 
shows the offset of the interrupt vector, for each 
interrupt type. 


Interrupt Type          MSR bit
                        IP    ILE   LE    ME    SF{}

System Reset            -     -     (1)   -     1
Machine Check           -     -     (1)   0     1
Data Storage            -     -     (1)   -     1
Instruction Storage     -     -     (1)   -     1
External                -     -     (1)   -     1
Alignment               -     -     (1)   -     1
Program                 -     -     (1)   -     1
FP Unavailable          -     -     (1)   -     1
Decrementer             -     -     (1)   -     1
System Call             -     -     (1)   -     1
Trace                   -     -     (1)   -     1
Floating-Point Assist   -     -     (1)   -     1

0    bit is set to 0
1    bit is set to 1
-    bit is not altered
(1)  bit is copied from ILE

Defined bits not shown above (BE, DR, EE, FE0,
FE1, FP, IR, POW, PR, RI, and SE) are set to 0.

Reserved bits are set as if written as 0.


Figure 68. MSR Setting Due to Interrupt 


Offset (hex)   Interrupt Type

00000          Reserved
00100          System Reset
00200          Machine Check
00300          Data Storage
00400          Instruction Storage
00500          External
00600          Alignment
00700          Program
00800          Floating-Point Unavailable
00900          Decrementer
00A00          Reserved
00B00          Reserved
00C00          System Call
00D00          Trace
00E00          Floating-Point Assist
00E10          Reserved
00FFF          Reserved
01000          Reserved, implementation-specific
02FFF          (end of interrupt vector locations)


Figure 69. Offset of Interrupt Vector by Interrupt Type
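
The figure can be used directly as a lookup table. In the sketch below the offsets are taken from Figure 69, while the base address is left as a parameter because it is determined by MSR IP and its values are defined elsewhere in this Book. The names are hypothetical.

```python
VECTOR_OFFSET = {
    "System Reset": 0x00100,
    "Machine Check": 0x00200,
    "Data Storage": 0x00300,
    "Instruction Storage": 0x00400,
    "External": 0x00500,
    "Alignment": 0x00600,
    "Program": 0x00700,
    "Floating-Point Unavailable": 0x00800,
    "Decrementer": 0x00900,
    "System Call": 0x00C00,
    "Trace": 0x00D00,
    "Floating-Point Assist": 0x00E00,
}


def vector_address(kind, base):
    """Return the real address at which execution resumes for this interrupt."""
    return base + VECTOR_OFFSET[kind]
```

Reserved offsets are deliberately left out of the table, so a lookup for one of them raises `KeyError` rather than returning an address.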


- Programming Note - 

The operating system should manage MSR R | as 
follows: 

■ In the Machine Check and System Reset 
interrupt handlers, interpret SRR1 bit 62 {30} 
(where MSR R) is placed) as: 

— 0: interrupt is not recoverable 

— 1: interrupt is recoverable with respect to 
the processor 

■ In each interrupt handler, when enough state 
has been saved that a Machine Check or 
System Reset interrupt can be recovered 
from, set MSR r , to 1. 

■ In each interrupt handler, do the following just
before returning:

— Set MSR[RI] to 0.

— Set SRR0 and SRR1 to the values to be
used by rfi. The new value of SRR1
should have bit 62 {30} set to 1 (which
will happen naturally if SRR1 is restored
to the value saved there by the interrupt,
because the interrupt handler will not be
executing this sequence unless the
interrupt is recoverable).

— Execute rfi. 
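
The SRR1 test described in this note can be sketched in C. The macro and function names are hypothetical. The mask follows from the bit numbering convention (bit 0 is the most significant bit), under which bit 62 of a 64-bit register and bit 30 of a 32-bit register are both the second-lowest bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* SRR1 bit 62 {30}, the position where MSR[RI] is saved. */
#define SRR1_RI  UINT64_C(0x2)

/* True if the Machine Check or System Reset handler may attempt
 * to resume the interrupted context. */
bool interrupt_is_recoverable(uint64_t srr1)
{
    return (srr1 & SRR1_RI) != 0;
}

/* SRR1 value to load before rfi: the saved value with bit 62 {30}
 * set to 1, as the note recommends. */
uint64_t srr1_for_return(uint64_t saved_srr1)
{
    return saved_srr1 | SRR1_RI;
}
```

A handler would call interrupt_is_recoverable on the SRR1 value it read on entry, before overwriting SRR0 or SRR1 for any other purpose.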

- Programming Note -

Use of any of the locations shown as reserved 
risks incompatibility with future implementations. 


13.5.1 System Reset Interrupt 

System Reset begins with a System Reset interrupt. 

If the System Reset exception caused the processor
state to be corrupted such that the contents of SRR0
or SRR1 are not valid, or other processor resources
are corrupt and would preclude a reliable restart,
then the processor sets SRR1 bit 62 {30} (where
MSR[RI] is normally placed) to 0, to indicate to the
interrupt handler that the interrupt is not recoverable.

The following registers are set: 

SRR0 Set to the effective address of the instruction
that the processor would have
attempted to execute next if no interrupt
conditions were present.

SRR1 

33:36 {1:4} Set to 0. 

42:47 {10:15} Set to 0. 

62 {30} Loaded from bit 62 {30} of the MSR if the
processor is in a recoverable state,
otherwise set to 0.

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00100 from the base
real address indicated by MSR[IP].

13.5.2 Machine Check Interrupt 

Machine Check interrupts are enabled when
MSR[ME] = 1. If MSR[ME] = 0 and a Machine Check
occurs, the processor enters the Checkstop state.

Disabled Machine Check (Checkstop State) 

When a processor is in Checkstop state, instruction 
processing is suspended and generally cannot be 
restarted without resetting the processor. Some 
implementations may freeze the content of all latches 
when entering Checkstop state so that the state of the 
processor can be analyzed as an aid in problem 
determination. 

Enabled Machine Check 

If the Machine Check exception caused the processor
state to be corrupted such that the contents of SRR0
or SRR1 are not valid, or other processor resources
are corrupt and would preclude a reliable restart,
then the processor sets SRR1 bit 62 {30} (where
MSR[RI] is normally placed) to 0, to indicate to the
interrupt handler that the interrupt is not recoverable.

In some systems, the operating system may attempt 
to identify and log the cause of the Machine Check. If 
the exception that caused the Machine Check does 
not preclude continued execution (i.e., if SRR1 bit 62 
{30} is set to 1 for the interrupt handler), the 
processor must be able to continue execution at the 
Machine Check interrupt vector address. 

The following registers are set: 

SRR0 Set on a "best effort" basis to the effective
address of some instruction that was executing
or was about to be executed when
the Machine Check exception occurred.
For further details see the Book IV,
PowerPC Implementation Features document
for the implementation.

SRR1 See the Book IV, PowerPC Implementation 
Features document for the implementation. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00200 from the base
real address indicated by MSR[IP].

- Programming Note - 

On some implementations a Machine Check interrupt
may occur due to referencing an invalid
(nonexistent) real address, either directly (with
MSR[DR] = 0), or through an invalid translation. On
such a system, execution of Data Cache Block set
to Zero can cause a delayed Machine Check interrupt
by introducing a block into the data cache
that is associated with an invalid real address. A
Machine Check interrupt could eventually occur
when and if a subsequent attempt is made to
store that block to main storage.


13.5.3 Data Storage Interrupt 

A Data Storage interrupt occurs when no higher
priority exception exists and a data storage access
cannot be performed for any of the following reasons:

■ The instruction results in a Direct-Store Error 
exception. 

■ The effective address of a load, store, dcbi, dcbst,
dcbf, dcbz, or icbi instruction cannot be
translated.

■ The instruction is not supported for the type of
storage addressed. (An interrupt may not occur
for this condition; see Section 12.6.3, “Instructions
not supported for T = 1” on page 174).

■ The access violates storage protection. 

■ Execution of an eciwx or ecowx instruction is
disallowed because EAR[E] = 0.

Such accesses can be generated by load/store type
instructions (discussed in Part 1, “PowerPC User 
Instruction Set Architecture” on page 1), certain 
storage control instructions, certain cache control 
instructions (discussed in Part 2, “PowerPC Virtual 
Environment Architecture” on page 117), and the 
eciwx and ecowx instructions (discussed in Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141). 

If a stwcx. or stdcx. has an effective address for
which a normal store would cause a Data Storage
interrupt, but the processor does not have the
reservation from lwarx or ldarx, then it is
implementation-dependent whether or not a Data
Storage interrupt occurs.

If a Move Assist instruction has a length of zero (in 
the XER), a Data Storage interrupt does not occur, 
regardless of the effective address. 

The interrupt cause is defined in the Data Storage 
Interrupt Status Register. These interrupts also use 
the Data Address Register. 

The following registers are set: 

SRR0 Set to the effective address of the instruction
that caused the interrupt.

SRR1

33:36 {1:4} Set to 0.

42:47 {10:15} Set to 0. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

DSISR 

0 Set to 1 if a load or store instruction
results in a Direct-Store Error exception,
otherwise 0.

1 Set to 1 if the translation of an attempted
access is not found in the hashed primary
HTEG, or in the re-hashed secondary
HTEG, or in the range of a DBAT register;
otherwise 0.

2:3 Set to 0.

4 Set to 1 if a storage access is not permitted
by the page or DBAT protection
mechanism described on page 179,
otherwise 0.

5 Set to 1 if the access was due to an eciwx,
ecowx, lwarx, ldarx, stwcx., or stdcx. that
addresses a direct-store segment (T=1 in
Segment Register or Segment Table Entry),
or if the access was due to a lwarx, ldarx,
stwcx., or stdcx. that addresses Write
Through storage; set to 0 otherwise.

6 Set to 1 for a store operation and to 0 for a
load operation.

7:8 Set to 0.

9 Reserved for DABR (see the Book IV,
PowerPC Implementation Features document
for the implementation).

10 Set to 1 if the Segment Table Search fails
to find a translation for the effective
address, otherwise set to 0.

11 Set to 1 if execution of an eciwx or ecowx
instruction was attempted with EAR[E] = 0,
otherwise set to 0.

12:31 Set to 0. 

DAR Set to the effective address of a storage 
element as described in the following list. 

■ A byte in the first word accessed in 
the page that caused the Data Storage 
interrupt, for a byte, halfword, or word 
access to a non-direct-store segment. 

■ A byte in the first doubleword
accessed in the page that caused the
Data Storage interrupt, for a
doubleword access to a non-direct-store
segment.

■ A byte in the first word accessed in 
the BAT area that caused the Data 
Storage interrupt, for a byte, halfword, 
or word access to a BAT area. 

■ A byte in the first doubleword
accessed in the BAT area that caused
the Data Storage interrupt, for a
doubleword access to a BAT area.

■ Any effective address in the range of 
storage being addressed, for a Direct- 
Store Error exception. 

Execution resumes at offset 0x00300 from the base
real address indicated by MSR[IP].
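
A Data Storage interrupt handler typically begins by decoding the DSISR. The following is a minimal sketch of that decoding in C, assuming the DSISR has already been read into a 32-bit value. The structure, field names, and function name are hypothetical. The masks follow from the bit numbers listed above, with bit 0 as the most significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask for DSISR bit n, where bit 0 is the most significant bit. */
#define DSISR_BIT(n)  (UINT32_C(0x80000000) >> (n))

/* Causes reported by the DSISR for a Data Storage interrupt. */
struct dsi_cause {
    bool direct_store_error;    /* bit 0 */
    bool translation_missing;   /* bit 1 */
    bool protection_violation;  /* bit 4 */
    bool unsupported_access;    /* bit 5 */
    bool is_store;              /* bit 6 */
    bool segment_table_miss;    /* bit 10 */
    bool external_disabled;     /* bit 11: eciwx/ecowx with EAR[E] = 0 */
};

struct dsi_cause decode_dsi(uint32_t dsisr)
{
    struct dsi_cause c;
    c.direct_store_error   = (dsisr & DSISR_BIT(0))  != 0;
    c.translation_missing  = (dsisr & DSISR_BIT(1))  != 0;
    c.protection_violation = (dsisr & DSISR_BIT(4))  != 0;
    c.unsupported_access   = (dsisr & DSISR_BIT(5))  != 0;
    c.is_store             = (dsisr & DSISR_BIT(6))  != 0;
    c.segment_table_miss   = (dsisr & DSISR_BIT(10)) != 0;
    c.external_disabled    = (dsisr & DSISR_BIT(11)) != 0;
    return c;
}
```

For example, a store to an address with no page table entry would be expected to produce a value with bits 1 and 6 set, which this sketch reports as translation_missing and is_store.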

13.5.4 Instruction Storage Interrupt 

An Instruction Storage interrupt occurs when no 
higher priority exception exists and an attempt to 
fetch the next instruction to be executed cannot be 
performed for any of the following reasons: 

■ The effective address cannot be translated. 

■ The fetch access is to a direct-store segment. 

■ The fetch access violates storage protection. 

Such accesses can only be generated by instruction 
fetches. The following registers are set: 

SRR0 Set to the effective address of the instruction
that the processor would have
attempted to execute next if no interrupt
conditions were present (if the interrupt
occurs on attempting to fetch a branch
target, SRR0 is set to the branch target
address).

SRR1

33 {1} Set to 1 if the translation of an attempted
access is not found in the hashed primary
HTEG, or in the re-hashed secondary
HTEG, or in the range of an IBAT register;
otherwise 0.

34 {2} Set to 0.

35 {3} Set to 1 if the fetch access was to a
direct-store segment (T=1 in Segment Register
or Segment Table Entry); set to 0
otherwise.

36 {4} Set to 1 if a storage access is not permitted
by the page or IBAT protection
mechanism described on page 179,
otherwise 0.

42 {10} Set to 1 if the Segment Table Search fails 
to find a translation for the effective 
address, otherwise set to 0. 

43:47 {11:15} Set to 0. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00400 from the base
real address indicated by MSR[IP].

13.5.5 External Interrupt 

An External interrupt occurs when no higher priority
exception exists, an External interrupt exception is
presented to the interrupt mechanism, and MSR[EE] = 1.
The occurrence of the interrupt does not cancel the
request.

The following registers are set: 

SRR0 Set to the effective address of the instruction
that the processor would have
attempted to execute next if no interrupt
conditions were present.

SRR1 

33:36 {1:4} Set to 0. 

42:47 {10:15} Set to 0. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00500 from the base
real address indicated by MSR[IP].


13.5.6 Alignment Interrupt 

An Alignment interrupt occurs when no higher priority 
exception exists and the implementation cannot 
perform a storage access for one of the reasons listed 
below. The term “protection boundary,” used below, 
refers to the boundary between protection domains. 
A protection domain is a direct-store segment, a block 
of storage defined by a BAT entry, or a 4K block of 
storage defined by a Page Table entry. Protection 
domains are defined only when DR = 1. 

■ The operand of a floating-point load or store is 
not word-aligned, for any storage class. 

■ The operand of a fixed-point doubleword load or 
store is not word-aligned, for any storage class. 

■ The operand of lmw, stmw, lwarx, or stwcx. is
not word-aligned, or the operand of ldarx or
stdcx. is not doubleword-aligned, for any storage
class.

■ The operand of a floating-point load or store is in 
a direct-store segment (T=1). 

■ The operand of an elementary or string load or 
store crosses a protection boundary. 

■ The operand of lmw or stmw crosses a segment
or BAT boundary.

■ The operand of Data Cache Block set to Zero
(dcbz) is in a page that is Write Through or
Caching Inhibited, for a virtual mode access.

In all cases above, an implementation may correctly
do the operation and not cause an Alignment interrupt.
Details can be found in the Book IV, PowerPC
Implementation Features document for the
implementation.

The following registers are set: 

SRR0 Set to the effective address of the instruction
that caused the interrupt.

SRR1 

33:36 {1:4} Set to 0. 

42:47 {10:15} Set to 0. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

DSISR 

0:11 Set to 0. 

12:13 Set to bits 30:31 of the instruction if 
DS-form. 

Set to 0b00 if D- or X-form. (Set to 0b00 on
32-bit implementations.)

14 Set to 0.

15:16 Set to bits 29:30 of the instruction if X-form.

Set to 0b00 if D- or DS-form.

17 Set to bit 25 of the instruction if X-form. 

Set to bit 5 of the instruction if D- or 
DS-form. 
18:21 Set to bits 21:24 of the instruction if X-form.
Set to bits 1:4 of the instruction if D- or
DS-form.

22:26 Set to bits 6:10 of the instruction
(RT/RS/FRT/FRS), except undefined for
dcbz.

27:31 Set to bits 11:15 of the instruction (RA) for
update form instructions; set to either bits
11:15 of the instruction or to any register
number not in the range of registers loaded
by a valid form instruction, for lmw, lswi,
and lswx; undefined for other instructions.

DAR Set to the effective address of the data
access as computed by the instruction
causing the alignment exception.

For an X-form Load or Store, it is acceptable to set
the DSISR to the same value that would have
resulted if the corresponding D- or DS-form instruction
had caused the interrupt. Similarly, for a D- or
DS-form Load or Store, it is acceptable to set the
DSISR to the value that would have resulted for the
corresponding X-form instruction. For example, an
unaligned lwax (that crosses a protection boundary)
would normally, following the description above,
cause the DSISR to be set to binary:

000000000000 00 0 01 0 0101 ttttt ?????

where “ttttt” denotes the RT field, and “?????”
denotes undefined bits. However, it is acceptable if it
causes the DSISR to be set as for lwa, which is

000000000000 10 0 00 0 1101 ttttt ?????

If there is no corresponding alternate form instruction
(e.g., for lwaux), the value described above must be
set in the DSISR.

The instruction pairs that may use the same DSISR
value are:

lbz/lbzx      lbzu/lbzux      lhz/lhzx      lhzu/lhzux
lha/lhax      lhau/lhaux      lwz/lwzx      lwzu/lwzux
lwa/lwax      ld/ldx          ldu/ldux
stb/stbx      stbu/stbux      sth/sthx      sthu/sthux
stw/stwx      stwu/stwux      std/stdx      stdu/stdux
lfs/lfsx      lfsu/lfsux      lfd/lfdx      lfdu/lfdux
stfs/stfsx    stfsu/stfsux    stfd/stfdx    stfdu/stfdux

Execution resumes at offset 0x00600 from the base
real address indicated by MSR[IP].

- Programming Note -

Software should not attempt to obtain a reservation
for an unaligned lwarx or ldarx, nor to simulate
an unaligned stwcx. or stdcx.


13.5.7 Program Interrupt


A Program interrupt occurs when no higher priority 
exception exists and one or more of the following 
exceptions arises during execution of an instruction: 

Floating-Point Enabled Exception 
A Floating-Point Enabled Exception type Program
interrupt is generated when the expression

(MSR[FE0] | MSR[FE1]) & FPSCR[FEX]

is 1. FPSCR[FEX] is turned on by the execution of a
floating-point instruction that causes an enabled
exception or by the execution of a “Move to
FPSCR” type instruction that results in both an
exception bit and its corresponding enable bit
being 1.

Illegal Instruction 

An Illegal Instruction type Program interrupt is 
generated when execution is attempted of an 
instruction with an illegal opcode or an illegal 
combination of opcode and extended opcode 
fields, or when execution is attempted of an 
optional instruction that is not provided by the 
implementation (with the exception of optional 
instructions that are treated as no-ops). Also, 
implementations are allowed to generate this 
interrupt for any invalid form instructions. 

See the appendix “Incompatibilities with the Power
Architecture” in Part 1, “PowerPC User Instruction Set
Architecture” on page 1, regarding
moving to and from the MQ and Decrementer
registers.

Privileged Instruction 

A Privileged Instruction type Program interrupt is
generated when the execution of a privileged
instruction is attempted and MSR[PR] = 1. Some
implementations may generate this interrupt for
mtspr or mfspr with an invalid SPR field if spr[0] = 1
and MSR[PR] = 1.

Trap 

A Trap type Program interrupt is generated when
any of the conditions specified in a Trap instruction
is met.

The following registers are set: 

SRR0 For all Program interrupts except a 
Floating-Point Enabled Exception when in 
one of the Imprecise modes, set to the 
effective address of the instruction that 
caused the Program interrupt. 

For an Imprecise Mode Floating-Point
Enabled Exception, set to the effective
address of the excepting instruction or to
the effective address of some subsequent
instruction. If it points to a subsequent
instruction, that instruction has not been
executed. If a subsequent instruction is
Synchronize (sync) or Instruction Synchronize
(isync), SRR0 will not point more than
four bytes beyond the sync or isync
instruction.

If FPSCR[FEX] = 1 but the Floating-Point Enabled
Exception interrupt is disabled by having
both MSR[FE0] and MSR[FE1] = 0, a Floating-Point
Enabled Exception interrupt will occur
prior to or at the next synchronizing event
if these MSR bits are altered with any
instruction that can set the MSR so that
the expression

(MSR[FE0] | MSR[FE1]) & FPSCR[FEX]

is 1. When this occurs, SRR0 is loaded
with the address of the instruction that
would have executed next, not with the
address of the instruction that modified the
MSR causing the interrupt.

SRR1 

33:36 {1:4} Set to 0. 

42 {10} Set to 0. 

43 {11} Set to 1 for a Floating-Point Enabled Exception
type Program interrupt, otherwise 0.

44 {12} Set to 1 for an Illegal Instruction type
Program interrupt, otherwise 0.

45 {13} Set to 1 for a Privileged Instruction type
Program interrupt, otherwise 0.

46 {14} Set to 1 for a Trap type Program interrupt,
otherwise 0.

47 {15} Set to 0 if SRR0 contains the address of
the instruction causing the exception, and
to 1 if SRR0 contains the address of a
subsequent instruction.

Others Loaded from the MSR.

Only one of bits 43:46 {11:14} can be set to 1.

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00700 from the base
real address indicated by MSR[IP].
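
A Program interrupt handler has to determine which of the four exception types occurred. The sketch below shows one way to decode SRR1 for that purpose, together with the enabling expression for Floating-Point Enabled Exceptions. All identifiers are hypothetical. The masks use the {n} (32-bit) bit numbers listed above, which select the same bits of the low-order word on a 64-bit implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask for SRR1 bit {n} in 32-bit numbering (bit 0 is the MSB of
 * the low-order 32 bits). */
#define SRR1_BIT32(n)  (UINT64_C(0x80000000) >> (n))

enum program_cause {
    PROG_NONE,
    PROG_FP_ENABLED,   /* bit 43 {11} */
    PROG_ILLEGAL,      /* bit 44 {12} */
    PROG_PRIVILEGED,   /* bit 45 {13} */
    PROG_TRAP          /* bit 46 {14} */
};

/* At most one of bits 43:46 {11:14} is set, so the order of the
 * tests does not matter. */
enum program_cause decode_program_cause(uint64_t srr1)
{
    if (srr1 & SRR1_BIT32(11)) return PROG_FP_ENABLED;
    if (srr1 & SRR1_BIT32(12)) return PROG_ILLEGAL;
    if (srr1 & SRR1_BIT32(13)) return PROG_PRIVILEGED;
    if (srr1 & SRR1_BIT32(14)) return PROG_TRAP;
    return PROG_NONE;
}

/* Bit 47 {15}: SRR0 addresses a subsequent instruction rather than
 * the instruction that caused the exception. */
bool srr0_is_subsequent(uint64_t srr1)
{
    return (srr1 & SRR1_BIT32(15)) != 0;
}

/* The expression (MSR[FE0] | MSR[FE1]) & FPSCR[FEX]. */
bool fp_enabled_exception_pending(bool fe0, bool fe1, bool fex)
{
    return (fe0 || fe1) && fex;
}
```

The last helper also illustrates the deferred case described above: with FPSCR[FEX] already 1, setting either FE0 or FE1 makes the expression true and the interrupt becomes pending.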

13.5.8 Floating-Point Unavailable 
Interrupt 

A Floating-Point Unavailable interrupt occurs when no
higher priority exception exists, an attempt is made to
execute a floating-point instruction (including
floating-point loads, stores, and moves), and MSR[FP] = 0.

The following registers are set: 


SRR0 Set to the effective address of the instruction
that caused the interrupt.

SRR1 

33:36 {1:4} Set to 0. 

42:47 {10:15} Set to 0. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00800 from the base
real address indicated by MSR[IP].

13.5.9 Decrementer Interrupt 

A Decrementer interrupt occurs when no higher
priority exception exists, the Decrementer exception
exists, and MSR[EE] = 1. The occurrence of the
interrupt cancels the request.

The following registers are set: 

SRR0 Set to the effective address of the instruction
that the processor would have
attempted to execute next if no interrupt
conditions were present.

SRR1 

33:36 {1:4} Set to 0. 

42:47 {10:15} Set to 0. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00900 from the base
real address indicated by MSR[IP].

13.5.10 System Call Interrupt 

A System Call interrupt occurs when a System Call 
instruction is executed. 

The following registers are set: 

SRR0 Set to the effective address of the instruction
following the System Call instruction.

SRR1 

32:47 {0:15} Undefined. 

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

Execution resumes at offset 0x00C00 from the base
real address indicated by MSR[IP].

13.5.11 Trace Interrupt

The Trace interrupt may optionally be implemented. 

If implemented, a Trace interrupt occurs when no
higher priority exception exists and either MSR[SE] = 1
and any instruction except rfi is successfully
completed, or MSR[BE] = 1 and a branch instruction is
completed.

The following registers are set: 

SRR0 Set to the effective address of the instruction
that the processor would have
attempted to execute next if no interrupt
conditions were present.

SRR1

33:36 and 42:47 {1:4 and 10:15} See the Book IV,
PowerPC Implementation Features document
for the implementation.

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

For further details see the Book IV, PowerPC
Implementation Features document for the implementation.

Execution resumes at offset 0x00D00 from the base
real address indicated by MSR[IP].

13.5.12 Floating-Point Assist 
Interrupt 

The Floating-Point Assist interrupt may optionally be
implemented. Its purpose is to allow software
assistance for relatively infrequent and complex
floating-point operations such as computations involving
denormalized numbers.

If implemented, the following registers are set: 

SRR0 Set to the effective address of the instruction
that caused the Floating-Point Assist
interrupt.

SRR1

33:36 and 42:47 {1:4 and 10:15} See the Book IV,
PowerPC Implementation Features document
for the implementation.

Others Loaded from the MSR. 

MSR See Figure 68 on page 193. 

For further details see the Book IV, PowerPC
Implementation Features document for the implementation.

Execution resumes at offset 0x00E00 from the base
real address indicated by MSR[IP].


13.6 Partially Executed 
Instructions 

The architecture permits certain instructions to be 
partially executed when an Alignment or Data Storage 
interrupt occurs, or an imprecise interrupt is forced by 
an instruction that causes an Alignment or Data 
Storage exception. These are: 

1. Load Multiple or Load String that causes an
Alignment or Data Storage interrupt: Some
registers in the range of registers to be loaded may
have been loaded.

2. Store Multiple or Store String that causes an 
Alignment or Data Storage interrupt: Some bytes 
of storage in the range addressed may have been 
updated. 

3. An elementary (non-multiple and non-string) store
that causes an Alignment or Data Storage
interrupt: Some bytes just before the boundary may
have been updated. If the instruction normally
alters CR0 (stwcx., stdcx.), CR0 is set to an
undefined value. For update forms, the update
register (RA) is not altered.

4. A floating-point load that causes an Alignment or 
Data Storage interrupt: the target register may 
be altered. For update forms, the update register 
(RA) is not altered. 

5. A load or store to a direct-store segment that
causes a Data Storage interrupt due to a
Direct-Store Error exception: Some of the associated
address/data transfers may not have been
initiated. All initiated transfers are completed before
the exception is reported, and the non-initiated
transfers are aborted. Thus the instruction
completes before the Data Storage interrupt occurs.

In the cases above, how many registers and how much
storage are altered is implementation-, instruction-,
and boundary-dependent.
However, storage protection is not violated.
Furthermore, if some of the data accessed is in direct-store
(T=1), and the instruction is not supported for
direct-store, the locations in direct-store are not accessed.

In the following situation, partial execution is not 
allowed (this preserves restartability): 

An elementary (non-multiple and non-string) 
fixed-point load that causes an Alignment or Data 
Storage interrupt: the target register is not 
altered. For update forms, the update register 
(RA) is not altered. 

13.7 Exception Ordering

Since multiple exceptions can exist at the same time 
and the architecture does not provide for reporting 
more than one interrupt at a time, the generation of 
more than one interrupt is prohibited. Also some 
exceptions would be lost if they were not recognized 
and handled when they occur. For example, if an 
external interrupt was generated when a data storage 
exception existed, the data storage exception would 
be lost. If the data storage exception was caused by 
a Store Multiple instruction that spanned a page 
boundary and the exception was a result of 
attempting to access the second page, the store could 
have modified locations in the first page even though 
it appeared that the Store Multiple instruction was 
never executed. 

In addition, the architecture defines imprecise
interrupts that must be recoverable, cannot be lost, and
can occur at any time with respect to the executing
instruction stream. Some of the maskable and
nonmaskable exceptions are persistent and can be
deferred. The following exceptions persist even
though some other interrupt is generated:

■ Floating-Point Enabled Exceptions 

■ External 

■ Decrementer 

For the above reasons, all exceptions are prioritized 
with respect to other exceptions that may exist at the 
same instant to prevent the loss of any exception that 
is not persistent. Some exceptions cannot exist at the 
same instant as some others. 

13.7.1 Unordered Interrupt 
Conditions 

The exceptions listed here are unordered, meaning 
that they may occur at any time regardless of the 
state of the interrupt mechanism. These exceptions 
must be recognized and processed when presented. 

1. System Reset 

2. Machine Check 

All other interrupts are ordered with respect to the 
interrupt mechanism resources. 


13.7.2 Ordered Exceptions 

The exceptions described here are ordered, meaning
that only one can be reported. However, the single
ordered exception that can be reported may exist in
concert with unordered exceptions. Ordered
exceptions may or may not be instruction-caused. The two
lists identify the ordered interrupts by type. The
order within the lists does not imply priority but only
lists the possible exceptions that may be reported.

System-caused or Imprecise

1. Program
   - Imprecise Mode Floating-Point Enabled Exception

2. External

3. Decrementer

Instruction-caused and Precise

1. Instruction Storage

2. Program
   - Illegal Instruction
   - Privileged Instruction

3. Function Dependent

   3.a Fixed-Point
       1a    Program - Trap
       1b    System Call
       1c.1  Alignment
       1c.2  Data Storage
       2     Trace (if implemented)

   3.b Floating-Point
       1     FP Unavailable
       2a    Program
             - Precise Mode Floating-Point Enabled Exception
       2b    Floating-Point Assist (if implemented)
       2c.1  Alignment
       2c.2  Data Storage
       3     Trace (if implemented)

For implementations that execute multiple instructions
in parallel using pipeline or super-scalar techniques,
or combinations of these, it can be difficult to
understand the ordering of exceptions. To understand this
ordering it is useful to consider a model in which an
instruction is fetched, decoded, and then executed. In
this model, the exceptions a single instruction would
generate are in the order shown in the list of
instruction-caused exceptions. Exceptions with
different numbers have different ordering. Exceptions
with the same numbering but different lettering are
mutually exclusive and cannot be caused by the same
instruction.

Even on processors that are capable of executing 
several instructions simultaneously, or out of order, 
instruction-caused interrupts (precise and imprecise) 
occur in program order. 

13.8 Interrupt Priorities

This section describes the relationship of nonmaskable,
maskable, precise, and imprecise interrupts. In
the following descriptions, the interrupt mechanism
waiting for all possible exceptions to be reported
includes only exceptions caused by previously
initiated instructions (e.g., it does not include waiting for
the Decrementer to step through zero). The
exceptions are listed in order of highest to lowest priority.

1. System Reset 

System Reset exception has the highest priority 
of all exceptions. If this exception exists, the 
interrupt mechanism ignores all other exceptions 
and generates a System Reset interrupt. 

Once the System Reset interrupt is generated, no 
nonmaskable interrupts are generated due to 
exceptions caused by instructions issued prior to 
the generation of this interrupt. 

2. Machine Check 

Machine Check exception is the second highest
priority exception. If this exception exists and a
System Reset exception does not exist, the
interrupt mechanism ignores all other exceptions and
generates a Machine Check interrupt.

Once the Machine Check interrupt is generated, 
no nonmaskable interrupts are generated due to 
exceptions caused by instructions issued prior to 
the generation of this interrupt. 

3. Instruction Dependent 

This exception is the third highest priority
exception. When this exception is created, the interrupt
mechanism waits for all possible Imprecise
exceptions to be reported. It then generates the
appropriate ordered interrupt if no higher priority
interrupt exception exists when the interrupt is to
be generated. Within this category a particular
instruction may present more than a single
exception. When this occurs, those exceptions
are ordered in priority as indicated in the
following lists.

A. Fixed-Point Loads and Stores 

a. Alignment 

b. Data Storage 

c. Trace (if implemented) 

B. Floating-Point Loads and Stores 

a. Floating-Point Unavailable 

b. Alignment 

c. Data Storage 

d. Trace (if implemented) 

C. Other Floating-Point Instructions 

a. Floating-Point Unavailable 

b. Program - Precise Mode Floating-Point 
Enabled Exception 

c. Floating-Point Assist (if implemented) 


d. Trace (if implemented) 

Not all floating-point instructions can cause 
enabled exceptions. 

D. rfi and mtmsr 

a. Program - Privileged Instruction 

b. Program - Precise Mode Floating-Point 
Enabled Exception 

c. Trace (if implemented) 

If the MSR bits FE0 and FE1 are set such that
Precise Mode Floating-Point Enabled Exception
interrupts are enabled and the
FPSCR[FEX] bit is set, a Program interrupt
will result prior to or at the next
synchronizing event.

The Trace interrupt should not be generated 
after an rfi. 

E. Other exceptions 

These exceptions are mutually exclusive and 
have the same priority: 

■ Program - Trap 

■ System Call 

■ Program - Privileged Instruction 

■ Program - Illegal Instruction 

F. Instruction Storage 

This exception has the lowest priority in this 
category. It is only recognized when all 
instructions prior to the instruction causing 
this exception appear to have completed and 
that instruction is to be executed. 

The priority of this interrupt is specified for
completeness and to ensure that it is not
given more favorable treatment. It is
acceptable for an implementation to treat this
interrupt as though it had a lower priority.

4. Program - Imprecise Mode Floating-Point Enabled 
Exception 

This exception is the fourth highest priority 
exception. When this exception is created, the 
interrupt mechanism waits for all other possible 
exceptions to be reported. It then generates this 
interrupt if no higher priority exception exists 
when the interrupt is to be generated. 

5. External 

This exception is the fifth highest priority
exception. When this exception is created, the interrupt
mechanism waits for all other possible exceptions
to be reported. It then generates this interrupt if
no higher priority exception exists when the
interrupt is to be generated.

6. Decrementer 

This exception is the lowest priority exception.
When this exception is created, the interrupt
mechanism waits for all other possible exceptions
to be reported. It then generates this interrupt if
no higher priority exception exists when the
interrupt is to be generated.
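
The six top-level priority classes above can be modelled as a simple ordered scan. This is a simplified sketch only: it assumes the set of existing exceptions is available as a bit mask (a hypothetical representation), and it ignores both the waiting rules and the MSR[EE] masking of External and Decrementer interrupts. All identifiers are illustrative.

```c
/* Priority classes from Section 13.8, highest priority first. */
enum exception_class {
    EXC_SYSTEM_RESET = 0,
    EXC_MACHINE_CHECK,
    EXC_INSTRUCTION_DEPENDENT,
    EXC_IMPRECISE_FP_ENABLED,
    EXC_EXTERNAL,
    EXC_DECREMENTER,
    EXC_CLASS_COUNT
};

/* pending has bit c set when an exception of class c exists.
 * Returns the class whose interrupt would be generated, or -1 if
 * no exception exists. */
int highest_priority(unsigned pending)
{
    for (int c = 0; c < EXC_CLASS_COUNT; c++) {
        if (pending & (1u << c))
            return c;
    }
    return -1;
}
```

Because External and Decrementer exceptions persist, an exception that loses this scan is not lost. It remains in the mask and is reconsidered after the higher priority interrupt has been handled.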
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Chapter 14. Timer Facilities 


14.1 Overview 

The Time Base and the Decrementer provide timing 
functions for the system. Specific instructions are 
provided for reading and writing the Time Base, while 
the Decrementer is manipulated as an SPR. Both are 
volatile resources and must be initialized during start 
up. 

Time Base (TB) 

The Time Base provides a long-period counter 
driven by an implementation-dependent frequency. 

Decrementer (DEC) 

The Decrementer, a counter that is updated at 
the same rate as the Time Base, provides a 
means of signalling an interrupt after a specified 
amount of time has elapsed unless 

■ the Decrementer is altered in the interim, or 

■ the Time Base update frequency changes. 


14.2 Time Base 


The Time Base (TB) is a 64-bit register (see 
Figure 70) containing a 64-bit unsigned integer that is 
incremented periodically. Each increment adds 1 to 
the low-order bit (bit 63). The frequency at which the 
counter is updated is implementation-dependent and 
need not be constant over long periods of time. 


+---------------------------+---------------------------+
|            TBU            |            TBL            |
+---------------------------+---------------------------+
 0                           32                       63


Field Description 

TBU Upper 32 bits of Time Base 

TBL Lower 32 bits of Time Base 

Figure 70. Time Base 


The Time Base runs continuously when powered on. 
There is no automatic initialization of the Time Base 
to a known value when the CPU is powered up; 
system software must perform this initialization if the 
value of the Time Base at any instant (rather than the 
difference between two values of the Time Base at 
different instants) is important. 

The Time Base increments until its value becomes 
0xFFFF_FFFF_FFFF_FFFF (2^64 - 1). At the next increment, 
its value becomes 0x0000_0000_0000_0000. 
There is no interrupt or other indication when this 
occurs. 

The period of the Time Base depends on the driving 
frequency. As an order of magnitude example, 
suppose that the CPU clock is 100 MHz and that the 
Time Base is driven by this frequency divided by 32. 
Then the period of the Time Base would be 

    T_TB = (2^64 x 32) / 100 MHz = 5.90 x 10^12 seconds 

which is approximately 187,000 years. 
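
The period arithmetic above can be checked with a few lines of Python. This is only a sketch: the 100 MHz clock and the divide-by-32 come from the example in the text, while the function name and the use of a 365.25-day year for the conversion are assumptions made here for illustration.

```python
# Example values taken from the text: 100 MHz CPU clock, Time Base at clock/32.
CPU_CLOCK_HZ = 100_000_000
DIVIDER = 32
# Assumption for the unit conversion only: a 365.25-day year.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def counter_period_seconds(bits, clock_hz=CPU_CLOCK_HZ, divider=DIVIDER):
    """Seconds for a `bits`-wide counter to wrap when stepped at clock_hz/divider."""
    return (2 ** bits) * divider / clock_hz


tb_period = counter_period_seconds(64)
tb_period_years = tb_period / SECONDS_PER_YEAR
print(f"Time Base period: {tb_period:.2e} s, about {tb_period_years:,.0f} years")
```

The same helper applies to any counter width, so it can also be used for a 32-bit counter driven at the same rate.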

The PowerPC Architecture does not specify a relation¬ 
ship between the frequency at which the Time Base is 
updated and other frequencies, such as the CPU clock 
or bus clock, in a PowerPC system. The Time Base 
update frequency is not required to be constant. 
What is required, so that system software can keep 
time of day and operate interval timers, is: 

■ The system provides an (implementation- 
dependent) interrupt to software whenever the 
update frequency of the Time Base changes, plus 
a means to determine what the current update 
frequency is, or 

■ The update frequency of the Time Base is under 
the control of the system software. 
- Programming Notes - 

Assuming that the operating system initializes the 
Time Base on power-on to some reasonable value 
and that the update frequency of the Time Base is 
constant, the Time Base can be used as a source 
of values that increase at a constant rate, such as 
for time stamps in trace entries. 

Even if the update frequency is not constant, 
values read from the Time Base will be 
monotonically increasing. If a trace entry is 
recorded each time the update frequency 
changes, the sequence of Time Base values can 
be post-processed to become actual time values. 

On an implementation that performs speculative 
execution, the Time Base may be read arbitrarily 
far “ahead” of the point at which it appears in the 
instruction stream. If it is important that this not 
occur, a context synchronizing operation such as 
the isync instruction should be placed imme¬ 
diately before the instructions that read the Time 
Base. 

See the description of the Time Base in Part 2, 
“PowerPC Virtual Environment Architecture” on 
page 117 for ways to compute time of day in 
POSIX format from the Time Base. 
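
The post-processing mentioned in the second note can be sketched as follows. The function name, the trace representation (a list of Time Base values paired with the new frequency at each change), and the assumption that the Time Base does not wrap during the trace are all hypothetical choices for illustration, not part of the architecture.

```python
def tb_to_seconds(tb_values, freq_changes, initial_hz, tb_start=0):
    """Convert raw Time Base readings to seconds elapsed since tb_start.

    tb_values    -- Time Base readings to convert
    freq_changes -- (tb_value_at_change, new_hz) pairs, one per recorded change
    initial_hz   -- update frequency in effect at tb_start
    Assumes the Time Base does not wrap within the trace.
    """
    segments = [(tb_start, initial_hz)] + sorted(freq_changes)
    results = []
    for tb in tb_values:
        elapsed = 0.0
        for i, (seg_start, hz) in enumerate(segments):
            if tb <= seg_start:
                break
            seg_end = segments[i + 1][0] if i + 1 < len(segments) else None
            # Count only the ticks of this reading that fall inside the segment.
            upto = tb if seg_end is None or tb < seg_end else seg_end
            elapsed += (upto - seg_start) / hz
        results.append(elapsed)
    return results


# 1000 Hz until the Time Base reads 2000, then 500 Hz.
times = tb_to_seconds([1000, 2000, 2500], [(2000, 500)], initial_hz=1000)
```

Each reading is integrated piecewise, so ticks counted at one frequency are never scaled by another.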


14.2.1 Writing the Time Base 


Writing the Time Base is privileged. Reading the Time 
Base is not privileged; it is discussed in Part 2, 
“PowerPC Virtual Environment Architecture” on 
page 117. 

It is not possible to write the entire 64-bit Time Base 
in a single instruction. The mttbl and mttbu extended 
mnemonics write the lower and upper halves of the 
Time Base (TBL and TBU), respectively, preserving 
the other half. These are extended mnemonics for 
the mtspr instruction; see page 231. 

The Time Base can be written by a sequence such as: 

    lwz    Rx,upper    # load 64-bit value for 
    lwz    Ry,lower    #   TB into Rx and Ry 
    li     Rz,0 
    mttbl  Rz          # force TBL to 0 
    mttbu  Rx          # set TBU 
    mttbl  Ry          # set TBL 

Loading 0 into TBL prevents the possibility of a carry 
from TBL to TBU while the Time Base is being initialized. 


- Programming Note - 

The instructions for writing the Time Base are 
implementation- and mode-independent. Thus 
code written to set the Time Base on a 32-bit 
implementation will work correctly on a 64-bit 
implementation running in either 64- or 32-bit 
mode. 


14.3 Decrementer 

The Decrementer (DEC) is a 32-bit decrementing 
counter that provides a mechanism for causing a 
Decrementer Interrupt after a programmable delay. 


+-------------------------------+
|              DEC              |
+-------------------------------+
 0                             31

Figure 71. Decrementer 


The Decrementer is driven by the same frequency as 
the Time Base. The period of the Decrementer will 
depend on the driving frequency, but if the same 
values are used as given above for the Time Base 
(section Chapter 8), and if the Time Base update 
frequency is constant, the period would be 

    T_DEC = (2^32 x 32) / 100 MHz = 1.37 x 10^3 seconds 

which is approximately 23 minutes. 

The Decrementer counts down, causing an interrupt 
(unless masked) when passing through zero. The 
Decrementer must be implemented such that the fol¬ 
lowing requirements are satisfied: 

1. The operation of the Time Base and the 
Decrementer are coherent, i.e. the counters are 
driven by the same fundamental time base. 

2. Loading a GPR from the Decrementer shall have 
no effect on the Decrementer. 

3. Storing a GPR to the Decrementer shall replace 
the value in the Decrementer with the value in 
the GPR. 

4. Whenever bit 0 of the Decrementer changes from 
0 to 1, an interrupt request is signalled. If mul¬ 
tiple Decrementer Interrupt requests are received 
before the first can be reported, only one inter¬ 
rupt is reported. The occurrence of a 
Decrementer Interrupt cancels the request. 

5. If the Decrementer is altered by software and the 
content of bit 0 is changed from 0 to 1, an inter¬ 
rupt request is signaled. 
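
The requirements listed above can be illustrated with a small software model. This is a sketch only: the class and method names are invented for illustration, interrupt masking is not modelled, and bit 0 is taken to be the most significant bit of the 32-bit register, following the bit numbering used throughout this book.

```python
class Decrementer:
    """Toy model of the 32-bit Decrementer and its interrupt request."""

    MASK = 0xFFFF_FFFF

    def __init__(self, value=0):
        self.value = value & self.MASK
        self.interrupt_pending = False

    @staticmethod
    def _bit0(v):
        # Bit 0 is the high-order bit in PowerPC bit numbering.
        return (v >> 31) & 1

    def _update(self, new):
        new &= self.MASK
        # Requirements 4 and 5: a 0 -> 1 change of bit 0 signals a request.
        # Several requests before delivery collapse into a single one.
        if self._bit0(self.value) == 0 and self._bit0(new) == 1:
            self.interrupt_pending = True
        self.value = new

    def tick(self):
        """One step of the driving frequency."""
        self._update(self.value - 1)

    def write(self, gpr):
        """Requirement 3: storing a GPR replaces the Decrementer value."""
        self._update(gpr)

    def read(self):
        """Requirement 2: reading has no effect on the Decrementer."""
        return self.value

    def take_interrupt(self):
        """Delivering the interrupt cancels the request."""
        taken = self.interrupt_pending
        self.interrupt_pending = False
        return taken


dec = Decrementer()
dec.write(2)
dec.tick()                      # 1
dec.tick()                      # 0, no request yet
pending_at_zero = dec.interrupt_pending
dec.tick()                      # wraps to 0xFFFF_FFFF, bit 0 goes 0 -> 1
```

In this model the request is raised when the counter passes through zero, not when it reaches zero, which matches the wording "when passing through zero" in the text.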
- Programming Note - 

In systems that change the Time Base update fre¬ 
quency for purposes such as power management, 
the Decrementer input frequency will also change. 
Software must be aware of this in order to set 
interval timers. 

On an implementation that performs speculative 
execution, the Decrementer may be read arbi¬ 
trarily far “ahead” of the point at which it appears 
in the instruction stream. If it is important that 
this not occur, a context synchronizing operation 
such as the isync instruction should be placed 
immediately before the instruction that reads the 
Decrementer. 


14.3.1 Writing and Reading the 
Decrementer 

The content of the Decrementer can be read or 
written using the mfspr and mtspr instructions, both 
of which are privileged when they refer to the 
Decrementer. Using an extended mnemonic (see 
page 231), the Decrementer may be written from reg¬ 
ister GPR Rx with: 

mtdec Rx 

- Programming Note - 

If the execution of this instruction causes bit 0 of 
the Decrementer to change from 0 to 1, an inter¬ 
rupt request is signalled. 


The Decrementer may be read into GPR Rx with: 
mfdec Rx 

Copying the Decrementer to a GPR has no effect on 
the Decrementer content or interrupt mechanism. 
Appendix A. Optional Instructions 


The instructions described in this appendix are 
optional. If an instruction is implemented that 
matches the semantics of an instruction described 
here, the implementation should be as specified here. 


The optional instructions are divided into two groups. 
Additional groups may be defined in the future. 

■ General Purpose group: fsqrt and fsqrts. 

■ Graphics group: stfiwx, fres, frsqrte, and fsel. 

If an implementation claims to support a given group, 
it must implement all the instructions in the group. 
A.1 Floating-Point Processor Instructions 


A.1.1 Floating-Point Store Instruction 

Byte ordering on PowerPC is Big-Endian by default. 
See Appendix D, “Little-Endian Byte Ordering” on 
page 233 for the effects of operating a PowerPC 
system with Little-Endian byte ordering. 


Store Floating-Point as Integer Word 
Indexed X-form 

stfiwx    FRS,RA,RB 


| 31 | FRS | RA | RB | 983 | / |
  0    6     11   16   21    31


if RA = 0 then b <- 0 
else           b <- (RA) 
EA <- b + (RB) 
MEM(EA, 4) <- (FRS)[32:63] 

Let the effective address (EA) be the sum 
(RA|0) + (RB). 

The contents of the low-order 32 bits of register FRS 
are stored, without conversion, into the word in 
storage addressed by EA. 
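
The effect of stfiwx can be sketched in Python as below. The function signature, the use of a `bytearray` for storage, and the list of GPR values are illustrative assumptions; Big-Endian byte ordering is assumed, and effective address wraparound is not modelled.

```python
def stfiwx(memory, frs_bits, gprs, ra, rb):
    """Store the low-order 32 bits of a 64-bit FPR image at EA = (RA|0) + (RB).

    memory   -- bytearray standing in for storage (Big-Endian assumed)
    frs_bits -- 64-bit integer image of register FRS
    gprs     -- list of GPR values; ra and rb are register numbers
    """
    b = 0 if ra == 0 else gprs[ra]       # RA field of 0 means the value 0
    ea = b + gprs[rb]
    # No conversion: bits 32:63 of FRS are copied as they are.
    memory[ea:ea + 4] = (frs_bits & 0xFFFF_FFFF).to_bytes(4, "big")
    return ea


memory = bytearray(16)
gprs = [0] * 32
gprs[0], gprs[3], gprs[4] = 100, 8, 4
ea = stfiwx(memory, 0xDEAD_BEEF_0000_002A, gprs, 3, 4)
```

Note that the RA test is on the register number, not on the register contents, so GPR 0 is never used as a base.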

If the contents of register FRS were produced, either 
directly or indirectly, by a Load Floating-Point Single 
instruction, a single-precision arithmetic instruction, 
or frsp, then the value stored is undefined. (The con¬ 
tents of register FRS are produced directly by such an 
instruction if FRS is the target register for the instruc¬ 
tion. The contents of register FRS are produced indi¬ 
rectly by such an instruction if FRS is the final target 
register of a sequence of one or more Floating-Point 
Move instructions, with the input to the sequence 
having been produced directly by such an instruction.) 

Special Registers Altered: 

None 
A.1.2 Floating-Point Arithmetic Instructions 


Floating Square Root [Single] A-form 

fsqrt     FRT,FRB    (Rc = 0) 
fsqrt.    FRT,FRB    (Rc = 1) 

| 63 | FRT | /// | FRB | /// | 22 | Rc |
  0    6     11    16    21    26   31

fsqrts    FRT,FRB    (Rc = 0) 
fsqrts.   FRT,FRB    (Rc = 1) 

| 59 | FRT | /// | FRB | /// | 22 | Rc |
  0    6     11    16    21    26   31


The square root of the floating-point operand in register 
FRB is placed into register FRT. 

If the most significant bit of the resultant significand is 
not a one the result is normalized. The result is 
rounded to the target precision under control of the 
Floating-Point Rounding Control field RN of the FPSCR 
and placed into register FRT. 

Operation with various special values of the operand 
is summarized below. 

    Operand    Result     Exception 
    -∞         QNaN(1)    VXSQRT 
    < 0        QNaN(1)    VXSQRT 
    -0         -0         None 
    +∞         +∞         None 
    SNaN       QNaN(1)    VXSNAN 
    QNaN       QNaN       None 

    (1) No result if FPSCR[VE] = 1. 

FPSCR[FPRF] is set to the class and sign of the result, 
except for Invalid Operation Exceptions when 
FPSCR[VE] = 1. 

Special Registers Altered: 

    FPRF FR FI 
    FX XX 
    VXSNAN VXSQRT 
    CR1 (if Rc = 1) 


Floating Reciprocal Estimate Single A-form 

fres      FRT,FRB    (Rc = 0) 
fres.     FRT,FRB    (Rc = 1) 

| 59 | FRT | /// | FRB | /// | 24 | Rc |
  0    6     11    16    21    26   31


A single-precision estimate of the reciprocal of the 
floating-point operand in register FRB is placed into 
register FRT. The estimate placed into register FRT 
is correct to a precision of one part in 256 of the 
reciprocal of (FRB). 

Operation with various special values of the operand 
is summarized below. 

    Operand    Result     Exception 
    -∞         -0         None 
    -0         -∞(1)      ZX 
    +0         +∞(1)      ZX 
    +∞         +0         None 
    SNaN       QNaN(2)    VXSNAN 
    QNaN       QNaN       None 

    (1) No result if FPSCR[ZE] = 1. 
    (2) No result if FPSCR[VE] = 1. 

FPSCR[FPRF] is set to the class and sign of the result, 
except for Invalid Operation Exceptions when 
FPSCR[VE] = 1 and Zero Divide Exceptions when 
FPSCR[ZE] = 1. 

Special Registers Altered: 

    FPRF FR (undefined) FI (undefined) 
    FX OX UX ZX 
    VXSNAN 
    CR1 (if Rc = 1) 
Floating Reciprocal Square Root Estimate A-form 

frsqrte   FRT,FRB    (Rc = 0) 
frsqrte.  FRT,FRB    (Rc = 1) 

| 63 | FRT | /// | FRB | /// | 26 | Rc |
  0    6     11    16    21    26   31


A double-precision estimate of the reciprocal of the 
square root of the floating-point operand in register 
FRB is placed into register FRT. The estimate placed 
into register FRT is correct to a precision of one part 
in 32 of the reciprocal of the square root of (FRB). 

Operation with various special values of the operand 
is summarized below. 

    Operand    Result     Exception 
    -∞         QNaN(2)    VXSQRT 
    < 0        QNaN(2)    VXSQRT 
    -0         -∞(1)      ZX 
    +0         +∞(1)      ZX 
    +∞         +0         None 
    SNaN       QNaN(2)    VXSNAN 
    QNaN       QNaN       None 

    (1) No result if FPSCR[ZE] = 1. 
    (2) No result if FPSCR[VE] = 1. 

FPSCR[FPRF] is set to the class and sign of the result, 
except for Invalid Operation Exceptions when 
FPSCR[VE] = 1 and Zero Divide Exceptions when 
FPSCR[ZE] = 1. 

Special Registers Altered: 

    FPRF FR (undefined) FI (undefined) 
    FX ZX 
    VXSNAN VXSQRT 
    CR1 (if Rc = 1) 


A.1.3 Floating-Point Select Instruction 


Floating Select A-form 

fsel      FRT,FRA,FRC,FRB    (Rc = 0) 
fsel.     FRT,FRA,FRC,FRB    (Rc = 1) 

| 63 | FRT | FRA | FRB | FRC | 23 | Rc |
  0    6     11    16    21    26   31


if (FRA) ≥ 0.0 then FRT <- (FRC) 
else                FRT <- (FRB) 

The floating-point operand in register FRA is compared 
to the value zero. If the operand is greater 
than or equal to zero, register FRT is set to the contents 
of register FRC. If the operand is less than zero 
or is a NaN, register FRT is set to the contents of register 
FRB. The comparison ignores the sign of zero 
(i.e., regards +0 as equal to -0). 

Special Registers Altered: 

    CR1 (if Rc = 1) 

- Programming Note - 

Examples of uses of this instruction can be found 
in Appendices E.3, “Floating-Point Conversions” 
on page 248, and E.4, “Floating-Point Selection” 
on page 251. 

Warning: Care must be taken in using fsel if IEEE 
compatibility is required, or if the values being 
tested can be NaNs or infinities; see Section E.4.4, 
“Notes” on page 251. 
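
The selection rule of fsel, and the reason for the warning, can be sketched in Python. The operand order follows the assembler syntax given above; the helper `fmax_via_fsel` is a hypothetical illustration of a typical use and is not taken from this appendix.

```python
import math


def fsel(fra, frc, frb):
    """Return frc if fra >= 0.0 (the sign of zero is ignored), otherwise frb.

    A NaN in fra selects frb.
    """
    if not math.isnan(fra) and fra >= 0.0:
        return frc
    return frb


def fmax_via_fsel(a, b):
    # Hypothetical use: for finite operands, a >= b exactly when a - b >= 0.
    return fsel(a - b, a, b)


selected_for_minus_zero = fsel(-0.0, 1.0, 2.0)
selected_for_nan = fsel(float("nan"), 1.0, 2.0)
```

With infinite operands the subtraction in `fmax_via_fsel` can itself produce a NaN (for example, infinity minus infinity), and a NaN in either operand silently selects the second one, which is why the warning above applies when IEEE behaviour is required.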
Appendix B. Suggested Floating-Point Models 


B.1 Floating-Point Round to Single-Precision Model 

The following describes algorithmically the operation of the Floating Round to Single-Precision instruction. 


If (FRB)[1:11] < 897 and (FRB)[1:63] > 0 then 
  Do 
    If FPSCR[UE] = 0 then goto Disabled Exponent Underflow 
    If FPSCR[UE] = 1 then goto Enabled Exponent Underflow 
  End 

If (FRB)[1:11] > 1150 and (FRB)[1:11] < 2047 then 
  Do 
    If FPSCR[OE] = 0 then goto Disabled Exponent Overflow 
    If FPSCR[OE] = 1 then goto Enabled Exponent Overflow 
  End 

If (FRB)[1:11] > 896 and (FRB)[1:11] < 1151 then goto Normal Operand 

If (FRB)[1:63] = 0 then goto Zero Operand 

If (FRB)[1:11] = 2047 then 
  Do 
    If (FRB)[12:63] = 0 then goto Infinity Operand 
    If (FRB)[12] = 1 then goto QNaN Operand 
    If (FRB)[12] = 0 and (FRB)[13:63] > 0 then goto SNaN Operand 
  End 
Disabled Exponent Underflow: 

sign <- (FRB)[0] 
If (FRB)[1:11] = 0 then 
  Do 
    exp <- -1022 
    frac <- 0b0 || (FRB)[12:63] 
  End 
If (FRB)[1:11] > 0 then 
  Do 
    exp <- (FRB)[1:11] - 1023 
    frac <- 0b1 || (FRB)[12:63] 
  End 
Denormalize operand: 
  G || R || X <- 0b000 
  Do while exp < -126 
    exp <- exp + 1 
    frac || G || R || X <- 0b0 || frac || G || (R | X) 
  End 
FPSCR[UX] <- (frac[24:52] || G || R || X) > 0 
Round Single(sign,exp,frac,G,R,X) 
FPSCR[XX] <- FPSCR[XX] | FPSCR[FI] 
If frac = 0 then 
  Do 
    FRT[0] <- sign 
    FRT[1:63] <- 0 
    If sign = 0 then FPSCR[FPRF] <- '+zero' 
    If sign = 1 then FPSCR[FPRF] <- '-zero' 
  End 
If frac > 0 then 
  Do 
    If frac[0] = 1 then 
      Do 
        If sign = 0 then FPSCR[FPRF] <- '+normal number' 
        If sign = 1 then FPSCR[FPRF] <- '-normal number' 
      End 
    If frac[0] = 0 then 
      Do 
        If sign = 0 then FPSCR[FPRF] <- '+denormalized number' 
        If sign = 1 then FPSCR[FPRF] <- '-denormalized number' 
      End 
    Normalize operand: 
      Do while frac[0] = 0 
        exp <- exp - 1 
        frac || G || R <- frac[1:52] || G || R || 0b0 
      End 
    FRT[0] <- sign 
    FRT[1:11] <- exp + 1023 
    FRT[12:63] <- frac[1:23] || ²⁹0 
  End 
Done 
Enabled Exponent Underflow: 

FPSCR[UX] <- 1 
sign <- (FRB)[0] 
If (FRB)[1:11] = 0 then 
  Do 
    exp <- -1022 
    frac <- 0b0 || (FRB)[12:63] 
  End 
If (FRB)[1:11] > 0 then 
  Do 
    exp <- (FRB)[1:11] - 1023 
    frac <- 0b1 || (FRB)[12:63] 
  End 
Normalize operand: 
  Do while frac[0] = 0 
    exp <- exp - 1 
    frac <- frac[1:52] || 0b0 
  End 
Round Single(sign,exp,frac,0,0,0) 
FPSCR[XX] <- FPSCR[XX] | FPSCR[FI] 
exp <- exp + 192 
FRT[0] <- sign 
FRT[1:11] <- exp + 1023 
FRT[12:63] <- frac[1:23] || ²⁹0 
If sign = 0 then FPSCR[FPRF] <- "+normal number" 
If sign = 1 then FPSCR[FPRF] <- "-normal number" 
Done 


Disabled Exponent Overflow: 

FPSCR[OX] <- 1 
If FPSCR[RN] = 0b00 then    /* Round to Nearest */ 
  Do 
    If (FRB)[0] = 0 then FRT <- 0x7FF0_0000_0000_0000 
    If (FRB)[0] = 1 then FRT <- 0xFFF0_0000_0000_0000 
    If (FRB)[0] = 0 then FPSCR[FPRF] <- "+infinity" 
    If (FRB)[0] = 1 then FPSCR[FPRF] <- "-infinity" 
  End 
If FPSCR[RN] = 0b01 then    /* Round Truncate */ 
  Do 
    If (FRB)[0] = 0 then FRT <- 0x47EF_FFFF_E000_0000 
    If (FRB)[0] = 1 then FRT <- 0xC7EF_FFFF_E000_0000 
    If (FRB)[0] = 0 then FPSCR[FPRF] <- "+normal number" 
    If (FRB)[0] = 1 then FPSCR[FPRF] <- "-normal number" 
  End 
If FPSCR[RN] = 0b10 then    /* Round to +Infinity */ 
  Do 
    If (FRB)[0] = 0 then FRT <- 0x7FF0_0000_0000_0000 
    If (FRB)[0] = 1 then FRT <- 0xC7EF_FFFF_E000_0000 
    If (FRB)[0] = 0 then FPSCR[FPRF] <- "+infinity" 
    If (FRB)[0] = 1 then FPSCR[FPRF] <- "-normal number" 
  End 
If FPSCR[RN] = 0b11 then    /* Round to -Infinity */ 
  Do 
    If (FRB)[0] = 0 then FRT <- 0x47EF_FFFF_E000_0000 
    If (FRB)[0] = 1 then FRT <- 0xFFF0_0000_0000_0000 
    If (FRB)[0] = 0 then FPSCR[FPRF] <- "+normal number" 
    If (FRB)[0] = 1 then FPSCR[FPRF] <- "-infinity" 
  End 
FPSCR[FR] <- undefined 
FPSCR[FI] <- 1 
FPSCR[XX] <- 1 
Done 
Enabled Exponent Overflow: 

sign <- (FRB)[0] 
exp <- (FRB)[1:11] - 1023 
frac <- 0b1 || (FRB)[12:63] 
Round Single(sign,exp,frac,0,0,0) 
FPSCR[XX] <- FPSCR[XX] | FPSCR[FI] 

Enabled Overflow: 

FPSCR[OX] <- 1 
exp <- exp - 192 
FRT[0] <- sign 
FRT[1:11] <- exp + 1023 
FRT[12:63] <- frac[1:23] || ²⁹0 
If sign = 0 then FPSCR[FPRF] <- '+normal number' 
If sign = 1 then FPSCR[FPRF] <- '-normal number' 
Done 


Zero Operand: 

FRT <- (FRB) 
If (FRB)[0] = 0 then FPSCR[FPRF] <- '+zero' 
If (FRB)[0] = 1 then FPSCR[FPRF] <- '-zero' 
FPSCR[FR FI] <- 0b00 
Done 


Infinity Operand: 

FRT <- (FRB) 
If (FRB)[0] = 0 then FPSCR[FPRF] <- '+infinity' 
If (FRB)[0] = 1 then FPSCR[FPRF] <- '-infinity' 
FPSCR[FR FI] <- 0b00 
Done 


QNaN Operand: 

FRT <- (FRB)[0:34] || ²⁹0 
FPSCR[FPRF] <- 'QNaN' 
FPSCR[FR FI] <- 0b00 
Done 


SNaN Operand: 

FPSCR[VXSNAN] <- 1 
If FPSCR[VE] = 0 then 
  Do 
    FRT[0:11] <- (FRB)[0:11] 
    FRT[12] <- 1 
    FRT[13:63] <- (FRB)[13:34] || ²⁹0 
    FPSCR[FPRF] <- 'QNaN' 
  End 
FPSCR[FR FI] <- 0b00 
Done 
Normal Operand: 

sign <- (FRB)[0] 
exp <- (FRB)[1:11] - 1023 
frac <- 0b1 || (FRB)[12:63] 
Round Single(sign,exp,frac,0,0,0) 
FPSCR[XX] <- FPSCR[XX] | FPSCR[FI] 
If exp > +127 and FPSCR[OE] = 0 then go to Disabled Exponent Overflow 
If exp > +127 and FPSCR[OE] = 1 then go to Enabled Overflow 
FRT[0] <- sign 
FRT[1:11] <- exp + 1023 
FRT[12:63] <- frac[1:23] || ²⁹0 
If sign = 0 then FPSCR[FPRF] <- '+normal number' 
If sign = 1 then FPSCR[FPRF] <- '-normal number' 
Done 


Round Single(sign,exp,frac,G,R,X): 

inc <- 0 
lsb <- frac[23] 
gbit <- frac[24] 
rbit <- frac[25] 
xbit <- (frac[26:52] || G || R || X) ≠ 0 
If FPSCR[RN] = 0b00 then 
  Do 
    If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0bu011u then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc <- 1    /* comparison ignores u bits */ 
  End 
If FPSCR[RN] = 0b10 then 
  Do 
    If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc <- 1    /* comparison ignores u bits */ 
  End 
If FPSCR[RN] = 0b11 then 
  Do 
    If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc <- 1    /* comparison ignores u bits */ 
  End 
frac[0:23] <- frac[0:23] + inc 
If carry_out = 1 then 
  Do 
    frac[0:23] <- 0b1 || frac[0:22] 
    exp <- exp + 1 
  End 
FPSCR[FR] <- inc 
FPSCR[FI] <- gbit | rbit | xbit 
Return 
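
The rounding step of this model can be sketched in Python for experimentation. The function below is an illustrative rendering only: `frac` is held as a 53-bit integer whose most significant bit corresponds to frac[0], the rounding mode is passed as a parameter instead of being read from the FPSCR, and the FPSCR updates are returned as values instead of being written to a register.

```python
def round_single(sign, exp, frac, g=0, r=0, x=0, rn=0b00):
    """Round a 53-bit significand to 24 bits.

    Returns (exp, frac_0_23, fr, fi), where frac_0_23 is the rounded
    24-bit significand and fr/fi mirror the FPSCR[FR]/FPSCR[FI] settings.
    """
    def bit(i):
        # Bit i of frac, with bit 0 as the most significant of 53 bits.
        return (frac >> (52 - i)) & 1

    lsb, gbit, rbit = bit(23), bit(24), bit(25)
    xbit = 1 if (frac & ((1 << 27) - 1)) or g or r or x else 0  # frac[26:52]||G||R||X

    inc = 0
    if rn == 0b00:                      # nearest, ties to even
        if gbit and (lsb or rbit or xbit):
            inc = 1
    elif rn == 0b10:                    # toward +infinity
        if sign == 0 and (gbit or rbit or xbit):
            inc = 1
    elif rn == 0b11:                    # toward -infinity
        if sign == 1 and (gbit or rbit or xbit):
            inc = 1
    # rn == 0b01 (truncate) never increments.

    top24 = (frac >> 29) + inc          # frac[0:23] + inc
    if top24 >> 24:                     # carry out of the 24-bit field
        top24 = 1 << 23                 # 0b1 || frac[0:22], which is now zero
        exp += 1
    return exp, top24, inc, gbit | rbit | xbit


# Significand of the double nearest 0.1 (0x3FB999999999999A), exponent -4.
exp_out, sig_out, fr, fi = round_single(0, -4, 0x1999999999999A)
```

Each group of three bit-pattern comparisons in the model collapses into a single condition here, for example the Round to Nearest patterns reduce to "guard bit set and any of lsb, round, or sticky set".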
B.2 Floating-Point Convert to Integer Model 


The following describes algorithmically the operation of the Floating Convert to Integer instructions. 

If Floating Convert to Integer Word 
  Then Do 
    round_mode <- FPSCR[RN] 
    tgt_precision <- '32-bit integer' 
  End 

If Floating Convert to Integer Word with round toward Zero 
  Then Do 
    round_mode <- 0b01 
    tgt_precision <- '32-bit integer' 
  End 

If Floating Convert to Integer Doubleword 
  Then Do 
    round_mode <- FPSCR[RN] 
    tgt_precision <- '64-bit integer' 
  End 

If Floating Convert to Integer Doubleword with round toward Zero 
  Then Do 
    round_mode <- 0b01 
    tgt_precision <- '64-bit integer' 
  End 

If (FRB)[1:11] = 2047 and (FRB)[12:63] = 0 then goto Infinity Operand 
If (FRB)[1:11] = 2047 and (FRB)[12] = 0 then goto SNaN Operand 
If (FRB)[1:11] = 2047 and (FRB)[12] = 1 then goto QNaN Operand 
If (FRB)[1:11] > 1086 then goto Large Operand 

sign <- (FRB)[0] 
If (FRB)[1:11] > 0 then exp <- (FRB)[1:11] - 1023    /* exp - bias */ 
If (FRB)[1:11] = 0 then exp <- -1022 
If (FRB)[1:11] > 0 then frac[0:64] <- 0b01 || (FRB)[12:63] || ¹¹0    /* normal */ 
If (FRB)[1:11] = 0 then frac[0:64] <- 0b00 || (FRB)[12:63] || ¹¹0    /* denormal */ 

gbit || rbit || xbit <- 0b000 
Do i = 1,63-exp    /* do the loop 0 times if exp = 63 */ 
  frac[0:64] || gbit || rbit || xbit <- 0b0 || frac[0:64] || gbit || (rbit | xbit) 
End 

Round Integer(sign,frac,gbit,rbit,xbit,round_mode) 

If sign = 1 then frac[0:64] <- ¬frac[0:64] + 1 

If tgt_precision = '32-bit integer' and frac[0:64] > +2^31 - 1 then goto Large Operand 
If tgt_precision = '64-bit integer' and frac[0:64] > +2^63 - 1 then goto Large Operand 
If tgt_precision = '32-bit integer' and frac[0:64] < -2^31 then goto Large Operand 
If tgt_precision = '64-bit integer' and frac[0:64] < -2^63 then goto Large Operand 

FPSCR[XX] <- FPSCR[XX] | FPSCR[FI] 

If tgt_precision = '32-bit integer' then FRT <- 0xuuuu_uuuu || frac[33:64]    /* u is undefined hex digit */ 
If tgt_precision = '64-bit integer' then FRT <- frac[1:64] 
FPSCR[FPRF] <- undefined 
Done 
Round Integer(sign,frac,gbit,rbit,xbit,round_mode): 

If round_mode = 0b00 then 
  Do 
    If sign || frac[64] || gbit || rbit || xbit = 0bu11uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || frac[64] || gbit || rbit || xbit = 0bu011u then inc <- 1    /* comparison ignores u bits */ 
    If sign || frac[64] || gbit || rbit || xbit = 0bu01u1 then inc <- 1    /* comparison ignores u bits */ 
  End 
If round_mode = 0b10 then 
  Do 
    If sign || frac[64] || gbit || rbit || xbit = 0b0u1uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || frac[64] || gbit || rbit || xbit = 0b0uu1u then inc <- 1    /* comparison ignores u bits */ 
    If sign || frac[64] || gbit || rbit || xbit = 0b0uuu1 then inc <- 1    /* comparison ignores u bits */ 
  End 
If round_mode = 0b11 then 
  Do 
    If sign || frac[64] || gbit || rbit || xbit = 0b1u1uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || frac[64] || gbit || rbit || xbit = 0b1uu1u then inc <- 1    /* comparison ignores u bits */ 
    If sign || frac[64] || gbit || rbit || xbit = 0b1uuu1 then inc <- 1    /* comparison ignores u bits */ 
  End 
frac[0:64] <- frac[0:64] + inc 
FPSCR[FR] <- inc 
FPSCR[FI] <- gbit | rbit | xbit 
Return 


Infinity Operand: 

FPSCR[FR FI VXCVI] <- 0b001 
If FPSCR[VE] = 0 then Do 
  If tgt_precision = '32-bit integer' then 
    Do 
      If sign = 0 then FRT <- 0xuuuu_uuuu_7FFF_FFFF    /* u is undefined hex digit */ 
      If sign = 1 then FRT <- 0xuuuu_uuuu_8000_0000    /* u is undefined hex digit */ 
    End 
  Else 
    Do 
      If sign = 0 then FRT <- 0x7FFF_FFFF_FFFF_FFFF 
      If sign = 1 then FRT <- 0x8000_0000_0000_0000 
    End 
  FPSCR[FPRF] <- undefined 
End 
Done 


SNaN Operand: 

FPSCR[FR FI VXSNAN VXCVI] <- 0b0011 
If FPSCR[VE] = 0 then 
  Do 
    If tgt_precision = '32-bit integer' then FRT <- 0xuuuu_uuuu_8000_0000    /* u is undefined hex digit */ 
    If tgt_precision = '64-bit integer' then FRT <- 0x8000_0000_0000_0000 
    FPSCR[FPRF] <- undefined 
  End 
Done 
QNaN Operand: 

FPSCR[FR FI VXCVI] <- 0b001 
If FPSCR[VE] = 0 then 
  Do 
    If tgt_precision = '32-bit integer' then FRT <- 0xuuuu_uuuu_8000_0000    /* u is undefined hex digit */ 
    If tgt_precision = '64-bit integer' then FRT <- 0x8000_0000_0000_0000 
    FPSCR[FPRF] <- undefined 
  End 
Done 


Large Operand: 

FPSCR[FR FI VXCVI] <- 0b001 
If FPSCR[VE] = 0 then Do 
  If tgt_precision = '32-bit integer' then 
    Do 
      If sign = 0 then FRT <- 0xuuuu_uuuu_7FFF_FFFF    /* u is undefined hex digit */ 
      If sign = 1 then FRT <- 0xuuuu_uuuu_8000_0000    /* u is undefined hex digit */ 
    End 
  Else 
    Do 
      If sign = 0 then FRT <- 0x7FFF_FFFF_FFFF_FFFF 
      If sign = 1 then FRT <- 0x8000_0000_0000_0000 
    End 
  FPSCR[FPRF] <- undefined 
End 
Done 
B.3 Floating-Point Convert from Integer Model 


The following describes algorithmically the operation of the Floating Convert from Integer instructions. 

sign <- (FRB)[0] 
exp <- 63 
frac[0:63] <- (FRB) 

If frac[0:63] = 0 then go to Zero Operand 
If sign = 1 then frac[0:63] <- ¬frac[0:63] + 1 

Do while frac[0] = 0    /* do the loop 0 times if (FRB) = maximum negative integer */ 
  frac[0:63] <- frac[1:63] || 0b0 
  exp <- exp - 1 
End 

Round Float(sign,exp,frac,FPSCR[RN]) 

If sign = 1 then FPSCR[FPRF] <- '-normal number' 
If sign = 0 then FPSCR[FPRF] <- '+normal number' 
FRT[0] <- sign 
FRT[1:11] <- exp + 1023    /* exp + bias */ 
FRT[12:63] <- frac[1:52] 
Done 


Zero Operand: 

FPSCR[FR FI] <- 0b00 
FPSCR[FPRF] <- '+zero' 
FRT <- 0x0000_0000_0000_0000 
Done 
Round Float(sign,exp,frac,round_mode): 

inc <- 0 
lsb <- frac[52] 
gbit <- frac[53] 
rbit <- frac[54] 
xbit <- frac[55:63] > 0 
If round_mode = 0b00 then 
  Do 
    If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0bu011u then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc <- 1    /* comparison ignores u bits */ 
  End 
If round_mode = 0b10 then 
  Do 
    If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc <- 1    /* comparison ignores u bits */ 
  End 
If round_mode = 0b11 then 
  Do 
    If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc <- 1    /* comparison ignores u bits */ 
    If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc <- 1    /* comparison ignores u bits */ 
  End 
frac[0:52] <- frac[0:52] + inc 
If carry_out = 1 then exp <- exp + 1 
FPSCR[FR] <- inc 
FPSCR[FI] <- gbit | rbit | xbit 
FPSCR[XX] <- FPSCR[XX] | FPSCR[FI] 
Return 
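
The convert-from-integer model can be exercised with the following Python sketch, which follows the steps above for a signed 64-bit input and returns the 64-bit image of the resulting double. The function names are illustrative assumptions, the rounding mode is passed as a parameter, and the FPSCR side effects are omitted.

```python
import struct


def convert_from_integer(value, round_mode=0b00):
    """Return the 64-bit double image for a signed 64-bit integer `value`."""
    if value == 0:
        return 0                                   # Zero Operand: +zero
    sign = 1 if value < 0 else 0
    frac = abs(value)                              # magnitude, at most 2**63
    exp = 63
    while ((frac >> 63) & 1) == 0:                 # normalize: frac[0] = 1
        frac = (frac << 1) & (2**64 - 1)
        exp -= 1

    lsb = (frac >> 11) & 1                         # frac[52]
    gbit = (frac >> 10) & 1                        # frac[53]
    rbit = (frac >> 9) & 1                         # frac[54]
    xbit = 1 if frac & 0x1FF else 0                # frac[55:63] > 0

    inc = 0
    if round_mode == 0b00 and gbit and (lsb or rbit or xbit):
        inc = 1
    elif round_mode == 0b10 and sign == 0 and (gbit or rbit or xbit):
        inc = 1
    elif round_mode == 0b11 and sign == 1 and (gbit or rbit or xbit):
        inc = 1

    top53 = (frac >> 11) + inc                     # frac[0:52] + inc
    if top53 >> 53:                                # carry out
        exp += 1
        top53 >>= 1
    mantissa = top53 & ((1 << 52) - 1)             # frac[1:52]
    return (sign << 63) | ((exp + 1023) << 52) | mantissa


def bits_to_float(bits):
    """Reinterpret a 64-bit image as a Python float."""
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


converted = bits_to_float(convert_from_integer(2**53 + 1))
```

In Round to Nearest mode the result can be compared against Python's own `float(int)` conversion, which also rounds to nearest with ties to even.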
Appendix C. Assembler Extended Mnemonics 


In order to make assembler language programs simpler to write and easier to understand, a set of extended 
mnemonics and symbols is provided that defines simple shorthand for the most frequently used forms of Branch 
Conditional, Compare, Trap, Rotate and Shift, and certain other instructions. 

Assemblers should provide the mnemonics and symbols listed here, and may provide others. 


C.1 Branch mnemonics 

The mnemonics discussed in this section are variations of the Branch Conditional instructions. 


C.1.1 BO and BI fields 

The 5-bit BO field in Branch Conditional instructions encodes the following operations: 

■ Decrement CTR 

■ Test CTR equal to 0 

■ Test CTR not equal to 0 

■ Test condition true 

■ Test condition false 

■ Branch prediction (taken, fall through) 

The 5-bit BI field in Branch Conditional instructions specifies which of the 32 bits in the CR represents the 
condition to test. 

To provide an extended mnemonic for every possible combination of BO and BI fields would require 2^10 = 1024 
mnemonics. Most of these would be only marginally useful. The following abbreviated set is intended to cover 
the most useful cases. Unusual cases can be coded using a basic Branch Conditional mnemonic (bc, bclr, bcctr) 
with the condition to be tested specified as a numeric operand. 


C.1.2 Simple branch mnemonics 

The mnemonics in Table 2 allow all the useful BO encodings to be specified, along with the AA (absolute address) 
and LK (set Link Register) fields. 

Notice that there are no extended mnemonics for relative and absolute unconditional branches. For these the 
basic mnemonics b, ba, bl, and bla should be used. 
Table 2. Simple branch mnemonics 

                                        LR not set                              LR set 
Branch semantics                        bc        bca       bclr      bcctr     bcl       bcla      bclrl     bcctrl 
                                        Relative  Absolute  To LR     To CTR    Relative  Absolute  To LR     To CTR 

Branch unconditionally                  -         -         blr       bctr      -         -         blrl      bctrl 
Branch if condition true                bt        bta       btlr      btctr     btl       btla      btlrl     btctrl 
Branch if condition false               bf        bfa       bflr      bfctr     bfl       bfla      bflrl     bfctrl 
Decrement CTR, 
  branch if CTR non-zero                bdnz      bdnza     bdnzlr    -         bdnzl     bdnzla    bdnzlrl   - 
Decrement CTR, 
  branch if CTR non-zero 
  AND condition true                    bdnzt     bdnzta    bdnztlr   -         bdnztl    bdnztla   bdnztlrl  - 
Decrement CTR, 
  branch if CTR non-zero 
  AND condition false                   bdnzf     bdnzfa    bdnzflr   -         bdnzfl    bdnzfla   bdnzflrl  - 
Decrement CTR, 
  branch if CTR zero                    bdz       bdza      bdzlr     -         bdzl      bdzla     bdzlrl    - 
Decrement CTR, 
  branch if CTR zero 
  AND condition true                    bdzt      bdzta     bdztlr    -         bdztl     bdztla    bdztlrl   - 
Decrement CTR, 
  branch if CTR zero 
  AND condition false                   bdzf      bdzfa     bdzflr    -         bdzfl     bdzfla    bdzflrl   - 


Instructions using one of the mnemonics in Table 2 that test a condition specify the condition as the first 
operand of the instruction. The following symbols are defined for use in such an operand. They can be combined 
with other values in an expression that identifies the CR bit (0:31) to be tested. These symbols and expressions 
can also be used with the basic Branch Conditional mnemonics, to specify the BI field. 


    Symbol   Value   Meaning 
    lt       0       Less than 
    gt       1       Greater than 
    eq       2       Equal 
    so       3       Summary overflow 
    un       3       Unordered (after floating-point comparison) 
    cr0      0       CR field 0 
    cr1      1       CR field 1 
    cr2      2       CR field 2 
    cr3      3       CR field 3 
    cr4      4       CR field 4 
    cr5      5       CR field 5 
    cr6      6       CR field 6 
    cr7      7       CR field 7 


Examples 


1. Decrement CTR and branch if it is still non-zero (closure of a loop controlled by a count loaded into CTR). 

    bdnz   target                 (equivalent to: bc 16,0,target) 

2. Same as (1) but branch only if CTR is non-zero and condition in CR0 is “equal.” 

    bdnzt  eq,target              (equivalent to: bc 8,2,target) 

3. Same as (2), but “equal” condition is in CR5. 

    bdnzt  4*cr5+eq,target        (equivalent to: bc 8,22,target) 

4. Branch if bit 27 of CR is false. 

    bf     27,target              (equivalent to: bc 4,27,target) 

5. Same as (4), but set the Link Register. This is a form of conditional “call.” 

    bfl    27,target              (equivalent to: bcl 4,27,target) 
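
The operand expressions used in these examples, such as 4*cr5+eq, reduce to simple arithmetic on the symbol values defined above. The sketch below shows that arithmetic in Python; the dictionary and function names are illustrative only and are not part of any assembler.

```python
# Symbol values as defined for use in condition operands.
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}


def bi_operand(cr_field, condition):
    """BI value (CR bit 0:31) for `condition` within CR field `cr_field` (0..7)."""
    if not 0 <= cr_field <= 7:
        raise ValueError("CR field must be in the range 0..7")
    return 4 * cr_field + CR_BIT[condition]


bi_example_2 = bi_operand(0, "eq")   # the "eq" operand of example 2
bi_example_3 = bi_operand(5, "eq")   # the "4*cr5+eq" operand of example 3
```

Each CR field holds four bits, which is why the field number is multiplied by 4 before the bit offset within the field is added.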

C.1.3 Branch mnemonics incorporating conditions 

The mnemonics defined in Table 3 are variations of the “branch if condition true” and “branch if condition false” 
BO encodings, with the most useful values of BI represented in the mnemonic rather than specified as a numeric 
operand. 

A standard set of codes has been adopted for the most common combinations of branch conditions. 

Code   Meaning 
lt     Less than 
le     Less than or equal 
eq     Equal 
ge     Greater than or equal 
gt     Greater than 
nl     Not less than 
ne     Not equal 
ng     Not greater than 
so     Summary overflow 
ns     Not summary overflow 
un     Unordered (after floating-point comparison) 
nu     Not unordered (after floating-point comparison) 

These codes are reflected in the mnemonics shown in Table 3. 


Table 3. Branch mnemonics incorporating conditions 

                                  LR not set                             LR set 
Branch semantics                  bc        bca       bclr     bcctr     bcl       bcla      bclrl     bcctrl 
                                  Relative  Absolute  To LR    To CTR    Relative  Absolute  To LR     To CTR 
Branch if less than               blt       blta      bltlr    bltctr    bltl      bltla     bltlrl    bltctrl 
Branch if less than or equal      ble       blea      blelr    blectr    blel      blela     blelrl    blectrl 
Branch if equal                   beq       beqa      beqlr    beqctr    beql      beqla     beqlrl    beqctrl 
Branch if greater than or equal   bge       bgea      bgelr    bgectr    bgel      bgela     bgelrl    bgectrl 
Branch if greater than            bgt       bgta      bgtlr    bgtctr    bgtl      bgtla     bgtlrl    bgtctrl 
Branch if not less than           bnl       bnla      bnllr    bnlctr    bnll      bnlla     bnllrl    bnlctrl 
Branch if not equal               bne       bnea      bnelr    bnectr    bnel      bnela     bnelrl    bnectrl 
Branch if not greater than        bng       bnga      bnglr    bngctr    bngl      bngla     bnglrl    bngctrl 
Branch if summary overflow        bso       bsoa      bsolr    bsoctr    bsol      bsola     bsolrl    bsoctrl 
Branch if not summary overflow    bns       bnsa      bnslr    bnsctr    bnsl      bnsla     bnslrl    bnsctrl 
Branch if unordered               bun       buna      bunlr    bunctr    bunl      bunla     bunlrl    bunctrl 
Branch if not unordered           bnu       bnua      bnulr    bnuctr    bnul      bnula     bnulrl    bnuctrl 


Instructions using the mnemonics in Table 3 specify the Condition Register field in an optional first operand. If 
the CR field being tested is CR0, this operand need not be specified. Otherwise, one of the CR field symbols 
listed earlier is coded as the first operand. 

Examples 

1. Branch if CR0 reflects condition “not equal.” 

    bne target                  (equivalent to: bc 4,2,target) 

2. Same as (1), but condition is in CR3. 

    bne cr3,target              (equivalent to: bc 4,14,target) 

3. Branch to an absolute target if CR4 specifies “greater than,” setting the Link Register. This is a form of 
   conditional “call.” 

    bgtla cr4,target            (equivalent to: bcla 12,17,target) 

4. Same as (3), but target address is in the Count Register. 

    bgtctrl cr4                 (equivalent to: bcctrl 12,17) 
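
The expansion performed for these mnemonics can be sketched as follows. This is an illustrative Python model, not an assembler: it assumes BO = 12 for “branch if condition true” and BO = 4 for “branch if condition false” (the values used in the examples above), and the names COND and expand are hypothetical.

```python
# (BO, bit within the CR field) for each condition code.
# BO 12 = branch if the CR bit is 1, BO 4 = branch if the CR bit is 0.
COND = {
    "lt": (12, 0), "ge": (4, 0), "nl": (4, 0),
    "gt": (12, 1), "le": (4, 1), "ng": (4, 1),
    "eq": (12, 2), "ne": (4, 2),
    "so": (12, 3), "ns": (4, 3),
    "un": (12, 3), "nu": (4, 3),
}


def expand(code: str, cr_field: int = 0) -> tuple[int, int]:
    """Return the (BO, BI) operands of the underlying bc instruction."""
    bo, bit = COND[code]
    return bo, 4 * cr_field + bit


if __name__ == "__main__":
    print(expand("ne"))        # operands for: bne target
    print(expand("gt", 4))     # operands for: bgtla cr4,target
```

Note that “less than or equal” is expressed as “not greater than”: the test is on a single CR bit, so le and ng share an encoding, as do ge and nl.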


C.1.4 Branch prediction 

In Branch Conditional instructions that are not always taken, the low-order bit (“y” bit) of the BO field provides a 
hint about whether the branch is likely to be taken: see the discussion of the “y” bit in Section 2.4.1, Branch 
Instructions, on page 19. 

Assemblers should set this bit to 0 unless otherwise directed. This default action means that: 

■ A Branch Conditional with a negative displacement field is predicted to be taken. 

■ A Branch Conditional with a non-negative displacement field is predicted not to be taken (fall through). 

■ A Branch Conditional to an address in the LR or CTR is predicted not to be taken (fall through). 

If the likely outcome (branch or fall through) of a given Branch Conditional instruction is known, a suffix can be 
added to the mnemonic that tells the assembler how to set the “y” bit. 

+    Predict branch to be taken. 

-    Predict branch not to be taken. 

Such a suffix can be added to any Branch Conditional mnemonic, either basic or extended. 

For relative and absolute branches (bc[l][a]), the setting of the “y” bit depends on whether the displacement field 
is negative or non-negative. For negative displacement fields, coding the suffix “+” causes the bit to be set to 0, 
and coding the suffix “-” causes the bit to be set to 1. For non-negative displacement fields, coding the suffix 
“+” causes the bit to be set to 1, and coding the suffix “-” causes the bit to be set to 0. 

For branches to an address in the LR or CTR (bclr[l] or bcctr[l]), coding the suffix “+” causes the “y” bit to be 
set to 1, and coding the suffix “-” causes the bit to be set to 0. 
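
The rules above reduce to one comparison: the “y” bit is 1 exactly when the requested prediction differs from the default prediction. A minimal Python sketch of that rule follows; the function name y_bit and its argument conventions are assumptions made for illustration.

```python
from typing import Optional


def y_bit(suffix: str, displacement: Optional[int] = None) -> int:
    """Return the "y" bit for a Branch Conditional.

    suffix       -- "+", "-", or "" (no suffix coded)
    displacement -- signed displacement for bc[l][a]; None for branches
                    to the LR or CTR (bclr[l], bcctr[l])
    """
    if suffix == "":
        return 0  # assemblers set the bit to 0 unless otherwise directed
    # With y = 0 only a negative displacement is predicted taken.
    default_taken = displacement is not None and displacement < 0
    want_taken = suffix == "+"
    return 0 if want_taken == default_taken else 1


if __name__ == "__main__":
    print(y_bit("+", -16))   # backward branch, already predicted taken
    print(y_bit("+", 16))    # forward branch, prediction must be reversed
    print(y_bit("-"))        # branch to LR/CTR, default is fall through
```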

Examples 

1. Branch if CR0 reflects condition “less than,” specifying that the branch should be predicted to be taken. 

    blt+ target 

2. Same as (1), but target address is in the Link Register and the branch should be predicted not to be taken. 

    bltlr- 


C.2 Condition Register logical mnemonics 

The Condition Register Logical instructions can be used to set (to 1), clear (to 0), copy, or invert a given Condition 
Register bit. Extended mnemonics are provided that allow these operations to be coded easily. 

Table 4. Condition Register logical mnemonics 

Operation                   Extended mnemonic   Equivalent to 
Condition Register set      crset bx            creqv bx,bx,bx 
Condition Register clear    crclr bx            crxor bx,bx,bx 
Condition Register move     crmove bx,by        cror bx,by,by 
Condition Register not      crnot bx,by         crnor bx,by,by 


Examples 

1. Set CR bit 25. 

    crset 25                    (equivalent to: creqv 25,25,25) 

2. Clear the SO bit of CR0. 

    crclr so                    (equivalent to: crxor 3,3,3) 

3. Same as (2), but SO bit to be cleared is in CR3. 

    crclr 4*cr3+so              (equivalent to: crxor 15,15,15) 

4. Invert the EQ bit. 

    crnot eq,eq                 (equivalent to: crnor 2,2,2) 

5. Same as (4), but EQ bit to be inverted is in CR4, and the result is to be placed into the EQ bit of CR5. 

    crnot 4*cr5+eq,4*cr4+eq     (equivalent to: crnor 22,18,18) 
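
Why these four equivalences hold can be checked with single-bit logic. The Python sketch below models each Condition Register Logical operation on one-bit values; the function names mirror the instruction mnemonics but are otherwise ordinary helpers written for this illustration.

```python
def creqv(a: int, b: int) -> int:
    """Equivalence (XNOR) of two CR bits."""
    return 1 ^ (a ^ b)


def crxor(a: int, b: int) -> int:
    return a ^ b


def cror(a: int, b: int) -> int:
    return a | b


def crnor(a: int, b: int) -> int:
    return 1 ^ (a | b)


if __name__ == "__main__":
    for bit in (0, 1):
        # crset: a bit is always equivalent to itself, so the result is 1.
        # crclr: a bit XORed with itself is always 0.
        # crmove: a bit ORed with itself is unchanged.
        # crnot: a bit NORed with itself is inverted.
        print(bit, creqv(bit, bit), crxor(bit, bit), cror(bit, bit), crnor(bit, bit))
```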


C.3 Subtract mnemonics 
C.3.1 Subtract immediate 

Although there is no “Subtract Immediate” instruction, its effect can be achieved by using an Add Immediate 
instruction with the immediate operand negated. Extended mnemonics are provided that include this negation, 
making the intent of the computation clearer. 


    subi   Rx,Ry,value          (equivalent to: addi   Rx,Ry,-value) 
    subis  Rx,Ry,value          (equivalent to: addis  Rx,Ry,-value) 
    subic  Rx,Ry,value          (equivalent to: addic  Rx,Ry,-value) 
    subic. Rx,Ry,value          (equivalent to: addic. Rx,Ry,-value) 


C.3.2 Subtract 


The Subtract From instructions subtract the second operand (RA) from the third (RB). Extended mnemonics are 
provided that use the more “normal” order, in which the third operand is subtracted from the second. Both these 
mnemonics can be coded with a final “o” and/or “.” to cause the OE and/or Rc bit to be set in the underlying 
instruction. 


    sub  Rx,Ry,Rz               (equivalent to: subf  Rx,Rz,Ry) 
    subc Rx,Ry,Rz               (equivalent to: subfc Rx,Rz,Ry) 

C.4 Compare mnemonics 


The L field in the fixed-point Compare instructions controls whether the operands are treated as 64-bit quantities 
(L=1) or as 32-bit quantities (L=0). Extended mnemonics are provided that represent the L value in the 
mnemonic rather than requiring it to be coded as a numeric operand. 

The BF field can be omitted if the result of the comparison is to be placed in CR Field 0. Otherwise the target CR 
field must be specified as the first operand, using one of the CR field symbols listed above or an explicit field 
number. 

Note: The basic Compare mnemonics of PowerPC are the same as those of Power, but the Power instructions 
have three operands while the PowerPC instructions have four. The assembler will recognize a basic Compare 
mnemonic with three operands as the Power form, and will generate the instruction with L=0. (Thus the 
assembler must require that the BF field, which normally can be omitted when CR Field 0 is the target, be specified 
explicitly if L is.) 

C.4.1 Doubleword comparisons 


These operations are available only in 64-bit implementations. 


Table 5. Doubleword compare mnemonics 

Operation                              Extended mnemonic   Equivalent to 
Compare doubleword immediate           cmpdi bf,ra,si      cmpi bf,1,ra,si 
Compare doubleword                     cmpd bf,ra,rb       cmp bf,1,ra,rb 
Compare logical doubleword immediate   cmpldi bf,ra,ui     cmpli bf,1,ra,ui 
Compare logical doubleword             cmpld bf,ra,rb      cmpl bf,1,ra,rb 


Examples 

1. Compare logical (unsigned) 64 bits in register Rx with immediate value 100 and place result in CR0. 

cmpldi Rx,100 (equivalent to: cmpli 0,1,Rx,100) 

2. Same as (1), but place results in CR4. 

cmpldi cr4,Rx,100 (equivalent to: cmpli 4,1,Rx,100) 

3. Compare registers Rx and Ry as signed 64-bit quantities and place result in CR0. 

cmpd Rx,Ry (equivalent to: cmp 0,1,Rx,Ry) 

C.4.2 Word comparisons 


These operations are available in all implementations. 


Table 6. Word compare mnemonics 

Operation                        Extended mnemonic   Equivalent to 
Compare word immediate           cmpwi bf,ra,si      cmpi bf,0,ra,si 
Compare word                     cmpw bf,ra,rb       cmp bf,0,ra,rb 
Compare logical word immediate   cmplwi bf,ra,ui     cmpli bf,0,ra,ui 
Compare logical word             cmplw bf,ra,rb      cmpl bf,0,ra,rb 


Examples 

1. Compare 32 bits in register Rx with immediate value 100 and place result in CR0. 

    cmpwi Rx,100                (equivalent to: cmpi 0,0,Rx,100) 

2. Same as (1), but place results in CR4. 

    cmpwi cr4,Rx,100            (equivalent to: cmpi 4,0,Rx,100) 

3. Compare registers Rx and Ry as logical 32-bit quantities and place result in CR0. 

    cmplw Rx,Ry                 (equivalent to: cmpl 0,0,Rx,Ry) 

C.5 Trap mnemonics 


The mnemonics defined in Table 7 are variations of the Trap instructions, with the most useful values of TO 
represented in the mnemonic rather than specified as a numeric operand. 

A standard set of codes has been adopted for the most common combinations of trap conditions. 


Code     Meaning                           TO encoding   <   >   =   <u  >u 
lt       Less than                         16            1   0   0   0   0 
le       Less than or equal                20            1   0   1   0   0 
eq       Equal                             4             0   0   1   0   0 
ge       Greater than or equal             12            0   1   1   0   0 
gt       Greater than                      8             0   1   0   0   0 
nl       Not less than                     12            0   1   1   0   0 
ne       Not equal                         24            1   1   0   0   0 
ng       Not greater than                  20            1   0   1   0   0 
llt      Logically less than               2             0   0   0   1   0 
lle      Logically less than or equal      6             0   0   1   1   0 
lge      Logically greater than or equal   5             0   0   1   0   1 
lgt      Logically greater than            1             0   0   0   0   1 
lnl      Logically not less than           5             0   0   1   0   1 
lng      Logically not greater than        6             0   0   1   1   0 
(none)   Unconditional                     31            1   1   1   1   1 

(<u and >u denote the logical, i.e. unsigned, less-than and greater-than comparisons.) 


These codes are reflected in the mnemonics shown in Table 7. 
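
The TO encodings are simply sums of five condition bits, which the following Python sketch makes explicit. It is an illustrative model only: the constant names, the TO dictionary and the traps function are hypothetical, and operands are modeled as 32-bit values unless another width is passed.

```python
# One bit of the TO field per condition, in the column order of the table.
LT, GT, EQ, LLT, LGT = 16, 8, 4, 2, 1

TO = {
    "lt": LT, "le": LT | EQ, "eq": EQ, "ge": GT | EQ, "gt": GT,
    "nl": GT | EQ, "ne": LT | GT, "ng": LT | EQ,
    "llt": LLT, "lle": LLT | EQ, "lge": LGT | EQ, "lgt": LGT,
    "lnl": LGT | EQ, "lng": LLT | EQ, "": 31,
}


def traps(to: int, a: int, b: int, bits: int = 32) -> bool:
    """Return True if a trap with the given TO field would be taken."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask                       # logical (unsigned) view
    sa = ua - (1 << bits) if ua >> (bits - 1) else ua  # signed view
    sb = ub - (1 << bits) if ub >> (bits - 1) else ub
    return bool(
        (to & LT and sa < sb) or (to & GT and sa > sb) or
        (to & EQ and ua == ub) or
        (to & LLT and ua < ub) or (to & LGT and ua > ub)
    )


if __name__ == "__main__":
    print(TO["ne"], traps(TO["ne"], 5, 0))             # not equal: trap taken
    print(TO["lgt"], traps(TO["lgt"], -1, 0x7FF))      # -1 is logically large
```

“Not equal” has no bit of its own; it is encoded as “less than or greater than,” which is why its TO value is 16 + 8.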


Table 7. Trap mnemonics 

                                          64-bit comparison       32-bit comparison 
Trap semantics                            tdi         td          twi         tw 
                                          Immediate   Register    Immediate   Register 
Trap unconditionally                      -           -           -           trap 
Trap if less than                         tdlti       tdlt        twlti       twlt 
Trap if less than or equal                tdlei       tdle        twlei       twle 
Trap if equal                             tdeqi       tdeq        tweqi       tweq 
Trap if greater than or equal             tdgei       tdge        twgei       twge 
Trap if greater than                      tdgti       tdgt        twgti       twgt 
Trap if not less than                     tdnli       tdnl        twnli       twnl 
Trap if not equal                         tdnei       tdne        twnei       twne 
Trap if not greater than                  tdngi       tdng        twngi       twng 
Trap if logically less than               tdllti      tdllt       twllti      twllt 
Trap if logically less than or equal      tdllei      tdlle       twllei      twlle 
Trap if logically greater than or equal   tdlgei      tdlge       twlgei      twlge 
Trap if logically greater than            tdlgti      tdlgt       twlgti      twlgt 
Trap if logically not less than           tdlnli      tdlnl       twlnli      twlnl 
Trap if logically not greater than        tdlngi      tdlng       twlngi      twlng 


Examples 

1. Trap if 64-bit register Rx is not 0. 

tdnei Rx,0 (equivalent to: tdi 24,Rx,0) 

2. Same as (1), but comparison is to register Ry. 

tdne Rx,Ry (equivalent to: td 24,Rx,Ry) 

3. Trap if register Rx, considered as a 32-bit quantity, is logically greater than 0x7FF. 

twlgti Rx,0x7FF (equivalent to: twi 1,Rx,0x7FF) 

4. Trap unconditionally. 

trap (equivalent to: tw 31,0,0) 

C.6 Rotate and Shift mnemonics 

The Rotate and Shift instructions provide powerful and general ways to manipulate register contents, but can be 
difficult to understand. Extended mnemonics are provided that allow some of the simpler operations to be coded 
easily. 

Mnemonics are provided for the following types of operation: 

Extract Select a field of n bits starting at bit position b in the source register; right or left justify this field in 
the target register; clear all other bits of the target register to 0. 

Insert Select a left-justified or right-justified field of n bits in the source register; insert this field starting at 
bit position b of the target register; leave other bits of the target register unchanged. (No extended 
mnemonic is provided for insertion of a left-justified field when operating on doublewords, because 
such an insertion requires more than one instruction.) 

Rotate Rotate the contents of a register right or left n bits without masking. 

Shift Shift the contents of a register right or left n bits, clearing vacated bits to 0 (logical shift). 

Clear Clear the leftmost or rightmost n bits of a register to 0. 

Clear left and shift left 

Clear the leftmost b bits of a register, then shift the register left by n bits. This operation can be used 
to scale a (known non-negative) array index by the width of an element. 

C.6.1 Operations on doublewords 


These operations are available only in 64-bit implementations. All these mnemonics can be coded with a final 
“.” to cause the Rc bit to be set in the underlying instruction. 


Table 8. Doubleword rotate and shift mnemonics 

Operation                             Extended mnemonic     Equivalent to 
Extract and left justify immediate    extldi ra,rs,n,b      rldicr ra,rs,b,n-1 
Extract and right justify immediate   extrdi ra,rs,n,b      rldicl ra,rs,b+n,64-n 
Insert from right immediate           insrdi ra,rs,n,b      rldimi ra,rs,64-(b+n),b 
Rotate left immediate                 rotldi ra,rs,n        rldicl ra,rs,n,0 
Rotate right immediate                rotrdi ra,rs,n        rldicl ra,rs,64-n,0 
Rotate left                           rotld ra,rs,rb        rldcl ra,rs,rb,0 
Shift left immediate                  sldi ra,rs,n          rldicr ra,rs,n,63-n 
Shift right immediate                 srdi ra,rs,n          rldicl ra,rs,64-n,n 
Clear left immediate                  clrldi ra,rs,n        rldicl ra,rs,0,n 
Clear right immediate                 clrrdi ra,rs,n        rldicr ra,rs,0,63-n 
Clear left and shift left immediate   clrlsldi ra,rs,b,n    rldic ra,rs,n,b-n 

Examples 

1. Extract the sign bit (bit 0) of register Ry and place the result right-justified into register Rx. 

    extrdi Rx,Ry,1,0            (equivalent to: rldicl Rx,Ry,1,63) 

2. Insert the bit extracted in (1) into the sign bit (bit 0) of register Rz. 

    insrdi Rz,Rx,1,0            (equivalent to: rldimi Rz,Rx,63,0) 

3. Shift the contents of register Rx left 8 bits. 

    sldi Rx,Rx,8                (equivalent to: rldicr Rx,Rx,8,55) 

4. Clear the high-order 32 bits of Ry and place the result into Rx. 

    clrldi Rx,Ry,32             (equivalent to: rldicl Rx,Ry,0,32) 
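
The doubleword equivalences can be checked with a small model of the underlying rotate instructions. The Python sketch below is an assumption-laden illustration, not a definition of the instructions: it models rldicl, rldicr and rldimi as “rotate left, then mask,” numbers bits 0 (most significant) to 63 as this book does, and ignores the Rc bit.

```python
M64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    n %= 64
    return ((x << n) | (x >> (64 - n))) & M64


def mask64(mb: int, me: int) -> int:
    """Ones in bit positions mb..me (bit 0 is the most significant bit)."""
    return ((1 << (me - mb + 1)) - 1) << (63 - me)


def rldicl(rs, sh, mb):           # rotate left, clear bits to the left of mb
    return rotl64(rs, sh) & mask64(mb, 63)


def rldicr(rs, sh, me):           # rotate left, clear bits to the right of me
    return rotl64(rs, sh) & mask64(0, me)


def rldimi(ra, rs, sh, mb):       # rotate left, insert under mask mb..63-sh
    m = mask64(mb, 63 - sh)
    return (rotl64(rs, sh) & m) | (ra & ~m & M64)


# Extended mnemonics, expanded as in Table 8.
def extrdi(rs, n, b): return rldicl(rs, b + n, 64 - n)
def insrdi(ra, rs, n, b): return rldimi(ra, rs, 64 - (b + n), b)
def sldi(rs, n): return rldicr(rs, n, 63 - n)
def clrldi(rs, n): return rldicl(rs, 0, n)


if __name__ == "__main__":
    sign = extrdi(1 << 63, 1, 0)           # example 1: sign bit, right-justified
    print(hex(sign), hex(insrdi(0, sign, 1, 0)))
```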


C.6.2 Operations on words 


These operations are available in all implementations. All these mnemonics can be coded with a final “.” to 
cause the Rc bit to be set in the underlying instruction. 


Table 9. Word rotate and shift mnemonics 

Operation                             Extended mnemonic     Equivalent to 
Extract and left justify immediate    extlwi ra,rs,n,b      rlwinm ra,rs,b,0,n-1 
Extract and right justify immediate   extrwi ra,rs,n,b      rlwinm ra,rs,b+n,32-n,31 
Insert from left immediate            inslwi ra,rs,n,b      rlwimi ra,rs,32-b,b,(b+n)-1 
Insert from right immediate           insrwi ra,rs,n,b      rlwimi ra,rs,32-(b+n),b,(b+n)-1 
Rotate left immediate                 rotlwi ra,rs,n        rlwinm ra,rs,n,0,31 
Rotate right immediate                rotrwi ra,rs,n        rlwinm ra,rs,32-n,0,31 
Rotate left                           rotlw ra,rs,rb        rlwnm ra,rs,rb,0,31 
Shift left immediate                  slwi ra,rs,n          rlwinm ra,rs,n,0,31-n 
Shift right immediate                 srwi ra,rs,n          rlwinm ra,rs,32-n,n,31 
Clear left immediate                  clrlwi ra,rs,n        rlwinm ra,rs,0,n,31 
Clear right immediate                 clrrwi ra,rs,n        rlwinm ra,rs,0,0,31-n 
Clear left and shift left immediate   clrlslwi ra,rs,b,n    rlwinm ra,rs,n,b-n,31-n 


Examples 

1. Extract the sign bit (bit 32) of register Ry and place the result right-justified into register Rx. 

extrwi Rx,Ry,1,0 (equivalent to: rlwinm Rx,Ry,1,31,31) 

2. Insert the bit extracted in (1) into the sign bit (bit 32) of register Rz. 

insrwi Rz,Rx,1,0 (equivalent to: rlwimi Rz,Rx,31,0,0) 

3. Shift the contents of register Rx left 8 bits, clearing the high-order 32 bits. 

slwi Rx,Rx,8 (equivalent to: rlwinm Rx,Rx,8,0,23) 

4. Clear the high-order 16 bits of the low-order 32 bits of Ry and place the result into Rx, clearing the high-order 
32 bits of Rx. 

clrlwi Rx,Ry,16 (equivalent to: rlwinm Rx,Ry,0,16,31) 
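
The word forms can be modeled the same way. The sketch below is a hypothetical 32-bit model in Python: it treats the register as a 32-bit value, numbers the mask bits 0 (most significant) to 31 as the MB and ME operands do, and ignores the high-order 32 bits of a 64-bit register and the Rc bit.

```python
M32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    n %= 32
    return ((x << n) | (x >> (32 - n))) & M32


def mask32(mb: int, me: int) -> int:
    """Ones in bit positions mb..me; the mask wraps around if mb > me."""
    if mb <= me:
        return ((1 << (me - mb + 1)) - 1) << (31 - me)
    return M32 ^ mask32(me + 1, mb - 1)


def rlwinm(rs, sh, mb, me):       # rotate left, then AND with mask
    return rotl32(rs, sh) & mask32(mb, me)


def rlwimi(ra, rs, sh, mb, me):   # rotate left, then insert under mask
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & M32)


# Extended mnemonics, expanded as in Table 9.
def extrwi(rs, n, b): return rlwinm(rs, b + n, 32 - n, 31)
def insrwi(ra, rs, n, b): return rlwimi(ra, rs, 32 - (b + n), b, b + n - 1)
def slwi(rs, n): return rlwinm(rs, n, 0, 31 - n)
def srwi(rs, n): return rlwinm(rs, 32 - n, n, 31)
def clrlwi(rs, n): return rlwinm(rs, 0, n, 31)


if __name__ == "__main__":
    print(hex(slwi(0xFFFFFFFF, 8)), hex(clrlwi(0x12345678, 16)))
```

A logical shift is thus a rotate whose mask discards the bits that wrapped around.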

C.7 Move To/From Special Purpose Register mnemonics 


The mtspr and mfspr instructions specify a Special Purpose Register (SPR) as a numeric operand. Extended 
mnemonics are provided that represent the SPR in the mnemonic rather than requiring it to be coded as an operand. 
Also shown here are extended mnemonics for Move From Time Base and Move From Time Base Upper, which 
are variants of the mftb instruction rather than of mfspr. 

Note: mftb serves as both a basic and an extended mnemonic. The assembler will recognize an mftb mnemonic 
with two operands as the basic form, and an mftb mnemonic with one operand as the extended form. 


Table 10. Extended mnemonics for moving to/from an SPR 

                                          Move To SPR                          Move From SPR (1) 
Special Purpose Register                  Extended        Equivalent to        Extended        Equivalent to 
Fixed Point Exception Register            mtxer Rx        mtspr 1,Rx           mfxer Rx        mfspr Rx,1 
Link Register                             mtlr Rx         mtspr 8,Rx           mflr Rx         mfspr Rx,8 
Count Register                            mtctr Rx        mtspr 9,Rx           mfctr Rx        mfspr Rx,9 
Data Storage Interrupt Status Register    mtdsisr Rx      mtspr 18,Rx          mfdsisr Rx      mfspr Rx,18 
Data Address Register                     mtdar Rx        mtspr 19,Rx          mfdar Rx        mfspr Rx,19 
Decrementer                               mtdec Rx        mtspr 22,Rx          mfdec Rx        mfspr Rx,22 
Storage Description Register 1            mtsdr1 Rx       mtspr 25,Rx          mfsdr1 Rx       mfspr Rx,25 
Save/Restore Register 0                   mtsrr0 Rx       mtspr 26,Rx          mfsrr0 Rx       mfspr Rx,26 
Save/Restore Register 1                   mtsrr1 Rx       mtspr 27,Rx          mfsrr1 Rx       mfspr Rx,27 
Special Purpose Registers G0 through G3   mtsprg n,Rx     mtspr 272+n,Rx       mfsprg Rx,n     mfspr Rx,272+n 
Address Space Register                    mtasr Rx        mtspr 280,Rx         mfasr Rx        mfspr Rx,280 
External Access Register                  mtear Rx        mtspr 282,Rx         mfear Rx        mfspr Rx,282 
Time Base [Lower]                         mttbl Rx        mtspr 284,Rx         mftb Rx         mftb Rx,268 
Time Base Upper                           mttbu Rx        mtspr 285,Rx         mftbu Rx        mftb Rx,269 
Processor Version Register                -               -                    mfpvr Rx        mfspr Rx,287 
IBAT Registers, Upper                     mtibatu n,Rx    mtspr 528+2×n,Rx     mfibatu Rx,n    mfspr Rx,528+2×n 
IBAT Registers, Lower                     mtibatl n,Rx    mtspr 529+2×n,Rx     mfibatl Rx,n    mfspr Rx,529+2×n 
DBAT Registers, Upper                     mtdbatu n,Rx    mtspr 536+2×n,Rx     mfdbatu Rx,n    mfspr Rx,536+2×n 
DBAT Registers, Lower                     mtdbatl n,Rx    mtspr 537+2×n,Rx     mfdbatl Rx,n    mfspr Rx,537+2×n 

(1) Except for mftb and mftbu. 


Examples 

1. Copy the contents of the low-order 32 bits of Rx to the XER. 

    mtxer Rx                    (equivalent to: mtspr 1,Rx) 

2. Copy the contents of the LR to register Rx. 

    mflr Rx                     (equivalent to: mfspr Rx,8) 

3. Copy the contents of Rx to the CTR. 

    mtctr Rx                    (equivalent to: mtspr 9,Rx) 
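
An assembler can implement these mnemonics as a table lookup. The Python sketch below illustrates the idea for the single-operand forms only; the SPR dictionary and the expand function are hypothetical names, and the time base, SPRG and BAT forms (which take a different instruction or an extra operand) are deliberately left out.

```python
# SPR numbers for the single-operand forms listed in Table 10.
SPR = {
    "xer": 1, "lr": 8, "ctr": 9, "dsisr": 18, "dar": 19, "dec": 22,
    "sdr1": 25, "srr0": 26, "srr1": 27, "asr": 280, "ear": 282,
}


def expand(mnemonic: str, reg: str) -> str:
    """Expand e.g. 'mtxer' or 'mflr' into the basic mtspr/mfspr form."""
    direction, name = mnemonic[:2], mnemonic[2:]
    number = SPR[name]
    if direction == "mt":
        return f"mtspr {number},{reg}"   # SPR number comes first
    if direction == "mf":
        return f"mfspr {reg},{number}"   # target register comes first
    raise ValueError(f"not a move to/from SPR mnemonic: {mnemonic}")


if __name__ == "__main__":
    for m in ("mtxer", "mflr", "mtctr"):
        print(m, "Rx ->", expand(m, "Rx"))
```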

C.8 Miscellaneous mnemonics 


No-op 

Many PowerPC instructions can be coded in a way such that, effectively, no operation is performed. An extended 
mnemonic is provided for the “preferred” form of no-op. If an implementation performs any type of run-time 
optimization related to no-ops, the preferred form is the no-op that will trigger this. 

nop (equivalent to: ori 0,0,0) 

Load Immediate 

The addi and addis instructions can be used to load an immediate value into a register. Extended mnemonics are 
provided to convey the idea that no addition is being performed but merely data movement (from the immediate 
field of the instruction to a register). 

Load a 16-bit signed immediate value into register Rx: 

li Rx,value (equivalent to: addi Rx,0,value) 

Load a 16-bit signed immediate value, shifted left by 16 bits, into register Rx: 

lis Rx,value (equivalent to: addis Rx,0,value) 
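
The values these two mnemonics produce can be sketched as follows. This is an illustrative Python model under stated assumptions: results are shown as 32-bit register images, the 16-bit immediate is sign-extended as addi and addis do, and the helper names are hypothetical.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def li(value: int) -> int:
    """Register image after li Rx,value (32-bit view)."""
    return sign_extend16(value) & 0xFFFFFFFF


def lis(value: int) -> int:
    """Register image after lis Rx,value (32-bit view)."""
    return (sign_extend16(value) << 16) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(li(-1)), hex(lis(0x1234)))
```

The 0 in the equivalent addi and addis forms denotes the value zero, not the contents of register 0, which is what makes these instructions usable as pure loads.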

Load Address 

This mnemonic permits computing the value of a base-displacement operand, using the addi instruction which 
normally requires separate register and immediate operands. 

la Rx,D(Ry) (equivalent to: addi Rx,Ry,D) 

The la mnemonic is useful for obtaining the address of a variable specified by name, allowing the assembler to 
supply the base register number and compute the displacement. If the variable v is located at offset Dv bytes 
from the address in register Rv, and the assembler has been told to use register Rv as a base for references to 
the data structure containing v, then the following line causes the address of v to be loaded into register Rx. 

la Rx,v (equivalent to: addi Rx,Rv,Dv) 

Move Register 

Several PowerPC instructions can be coded in a way such that they simply copy the contents of one register to 
another. An extended mnemonic is provided to convey the idea that no computation is being performed but 
merely data movement (from one register to another). 

The following instruction copies the contents of register Ry into register Rx. This mnemonic can be coded with a 
final “.” to cause the Rc bit to be set in the underlying instruction. 

mr Rx,Ry (equivalent to: or Rx,Ry,Ry) 

Complement Register 

Several PowerPC instructions can be coded in a way such that they complement the contents of one register and 
place the result into another register. An extended mnemonic is provided that allows this operation to be coded 
easily. 

The following instruction complements the contents of register Ry and places the result into register Rx. This 
mnemonic can be coded with a final “.” to cause the Rc bit to be set in the underlying instruction. 

not Rx,Ry (equivalent to: nor Rx,Ry,Ry) 





Appendix D. Little-Endian Byte Ordering 


It is computed that eleven Thousand Persons 
have, at several Times, suffered Death, 
rather than submit to break their Eggs at the 
smaller End. Many hundred large Volumes 


have been published upon this Controversy 


Jonathan Swift, Gulliver's Travels 


D.1 Byte Ordering 

If scalars (individual data items and instructions) were 
indivisible, then there would be no such concept as 
“byte ordering.” It is meaningless to talk of the 
“order” of bits or groups of bits within the smallest 
addressable unit of storage, because nothing can be 
observed about such order. Only when scalars, which 
the programmer and processor regard as indivisible 
quantities, can be made up of more than one addressable 
unit of storage does the question of “order” 
arise. 

For a machine in which the smallest addressable unit 
of storage is the 64-bit doubleword, there is no question 
of the ordering of “bytes” within doublewords. 
All transfers of individual scalars to and from storage 
(e.g., between registers and storage) are of 
doublewords, and the address of the “byte” containing 
the high-order 8 bits of a scalar is no different 
from the address of a “byte” containing any other 
part of the scalar. 

For PowerPC, as for most computers currently available, 
the smallest addressable unit of storage is the 
8-bit byte. Many scalars are halfwords, words, or 
doublewords, which consist of groups of bytes. When 
a word-length scalar is moved from a register to 
storage, the scalar occupies four consecutive byte 
addresses. It thus becomes meaningful to discuss the 
order of the byte addresses with respect to the value 
of the scalar: which byte contains the highest-order 8 
bits of the scalar, which byte contains the next- 
highest-order 8 bits, and so on. 

Given a scalar that spans multiple bytes, the choice of 
byte ordering is essentially arbitrary. There are 
4! = 24 ways to specify the ordering of four bytes 
within a word, but only two of these orderings are 
sensible: 

■ The ordering that assigns the lowest address to 
the highest-order (“leftmost”) 8 bits of the scalar, 
the next sequential address to the next-highest- 
order 8 bits, and so on. This is called Big-Endian 
because the “big end” of the scalar, considered 
as a binary number, comes first in storage. IBM 
RISC System/6000, IBM System/370, and 
Motorola 680x0 are examples of computers using 
this byte ordering. 

■ The ordering that assigns the lowest address to 
the lowest-order (“rightmost”) 8 bits of the scalar, 
the next sequential address to the next-lowest- 
order 8 bits, and so on. This is called Little- 
Endian because the “little end” of the scalar, 
considered as a binary number, comes first in 
storage. DEC VAX and Intel x86 are examples of 
computers using this byte ordering. 
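
The two orderings can be seen directly by laying out a word in a byte string. The Python sketch below is only an illustration; the helper name layout is hypothetical, and the value 0x11121314 is chosen so that each byte is recognizable.

```python
def layout(value: int, nbytes: int, order: str) -> bytes:
    """Return the bytes of value in increasing address order.

    order is "big" (highest-order byte at the lowest address) or
    "little" (lowest-order byte at the lowest address).
    """
    return value.to_bytes(nbytes, order)


if __name__ == "__main__":
    word = 0x11121314
    print("Big-Endian:   ", layout(word, 4, "big").hex(" "))
    print("Little-Endian:", layout(word, 4, "little").hex(" "))
```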

D.2 Structure Mapping Examples 

Figure 72 on page 234 shows an example of a C language 
structure s containing an assortment of scalars 
and one character string. The value assumed to be in 
each structure element is shown in hex in the C comments; 
these values are used below to show how the 
bytes making up each structure element are mapped 
into storage. 

C structure mapping rules permit the use of padding 
(skipped bytes) in order to align the scalars on desirable 
boundaries. Figures 73 and 74 show each scalar 
aligned at its natural boundary. This introduces 
padding of 4 bytes between a and b, one byte 
between d and e, and two bytes between e and f. The 
same amount of padding is present for both Big-Endian 
and Little-Endian mappings. 

struct { 
    int     a;      /* 0x1112_1314                       word           */ 
    double  b;      /* 0x2122_2324_2526_2728             doubleword     */ 
    char *  c;      /* 0x3132_3334                       word           */ 
    char    d[7];   /* 'A', 'B', 'C', 'D', 'E', 'F', 'G' array of bytes */ 
    short   e;      /* 0x5152                            halfword       */ 
    int     f;      /* 0x6162_6364                       word           */ 
} s; 

Figure 72. C structure 's', showing values of elements 


D.2.1 Big-Endian mapping 

The Big-Endian mapping of structure s is shown in 
Figure 73. Addresses are shown in hex at the left of 
each doubleword, and in small figures below each 
byte. The content of each byte, as indicated in the C 
example in Figure 72, is shown in hex (as characters 
for the elements of the string). 

Figure 73. Big-Endian mapping of structure 's' 


D.2.2 Little-Endian mapping 

The same structure s is shown mapped Little-Endian 
in Figure 74. Doublewords are shown laid out from 
right to left, which is the common way of showing 
storage maps for Little-Endian machines. 

Figure 74. Little-Endian mapping of structure 's' 
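
The two mappings can be reproduced with Python's struct module. This is a sketch under stated assumptions: the padding is written out explicitly (4 bytes after a, 1 byte after d, 2 bytes after e), the pointer c is treated as a 32-bit value, and the double b is packed as a 64-bit integer so that its bit pattern matches the hex value given in Figure 72. The names FIELDS and VALUES are hypothetical.

```python
import struct

# a, 4 pad bytes, b, c, d[7], 1 pad byte, e, 2 pad bytes, f
FIELDS = "i4xQI7sxh2xi"
VALUES = (0x11121314, 0x2122232425262728, 0x31323334,
          b"ABCDEFG", 0x5152, 0x61626364)

big = struct.pack(">" + FIELDS, *VALUES)      # Big-Endian mapping
little = struct.pack("<" + FIELDS, *VALUES)   # Little-Endian mapping

if __name__ == "__main__":
    for name, image in (("Big-Endian", big), ("Little-Endian", little)):
        print(name)
        for offset in range(0, len(image), 8):
            print(f"  {offset:02X}: {image[offset:offset + 8].hex(' ')}")
```

The character array occupies the same addresses in both images, because its elements are single bytes; only the multi-byte scalars differ.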

D.3 PowerPC Byte Ordering 

The body of each of the three PowerPC Architecture 
Books, Part 1, “PowerPC User Instruction Set 
Architecture” on page 1, Part 2, “PowerPC Virtual 
Environment Architecture” on page 117, and Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141, is written as if a PowerPC system runs 
only in Big-Endian mode. In fact, a PowerPC system 
can instead run in Little-Endian mode, in which the 
instruction set behaves as if the byte ordering were 
Little-Endian, and can change Endian mode dynamically. 
The remainder of this appendix describes how 
the mode is controlled, and how running in Little-Endian 
mode differs from running in Big-Endian mode. 


D.3.1 Controlling PowerPC Byte Ordering 

The Endian mode of a PowerPC processor is controlled 
by two bits: the LE (Little-Endian Mode) bit 
specifies the current mode of the processor, and the 
ILE (Interrupt Little-Endian Mode) bit specifies the 
mode that the processor enters when the system 
error handler is invoked. For both bits, a value of 0 
specifies Big-Endian mode and a value of 1 specifies 
Little-Endian mode. The location of these bits and the 
requirements for altering them are described in 
Part 3, “PowerPC Operating Environment 
Architecture” on page 141. 

When a PowerPC system comes up after power-on-reset, 
Big-Endian mode is in effect (see Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141). Thereafter, methods described in Book III 
can be used to change the mode, as can both 
invoking the system error handler and returning from 
the system error handler. 

- Programming Note - 

For a discussion of software synchronization 
requirements when altering the LE and ILE bits, 
please refer to the appendix entitled “Synchronization 
Requirements for Special Registers.” 

D.3.2 PowerPC Little-Endian Byte Ordering 

One might expect that a PowerPC system running in 
Little-Endian mode would have to perform a 2-way, 
4-way, or 8-way byte swap when transferring a 
halfword, word, or doubleword to or from storage, 
e.g., when transferring data between storage and a 
general purpose or floating-point register, when 
fetching instructions, and when transferring data 
between storage and an Input/Output (I/O) device. 
PowerPC systems do not do such swapping, but 
instead achieve the effect of Little-Endian byte 
ordering by modifying the low-order three bits of the 
effective address (EA) as described below. Individual 
scalars actually appear in storage in Big-Endian byte 
order. 

The modification affects only the addresses presented 
to the storage subsystem (see Part 3, “PowerPC 
Operating Environment Architecture” on page 141). 
All effective addresses in architecturally defined registers, 
as well as the Current Instruction Address 
(CIA) and Next Instruction Address (NIA), are independent 
of Endian mode. For example: 

■ The effective address placed into the Link Register 
by a Branch instruction with LK = 1 is equal 
to the CIA of the Branch instruction + 4; 

■ The effective address placed into RA by a 
Load/Store with Update instruction is the value 
computed as described in the instruction 
description; and 

■ The effective addresses placed into System Registers 
when the system error handler is invoked 
(e.g., SRR0, DAR: see Part 3, “PowerPC Operating 
Environment Architecture” on page 141) 
are those that were computed or would have 
been computed by the interrupted program. 

The modification is independent of the address translation 
mechanism, and thus, e.g., applies regardless 
of whether translation is enabled or disabled, whether 
the accessed storage is in an ordinary storage 
segment, a direct-store segment, or a BAT area, etc. 
(see Part 3, “PowerPC Operating Environment 
Architecture” on page 141). The actual transfer of 
data and instructions to and from storage is unaffected 
(and thus unencumbered by multiplexors for 
byte swapping). 

The modification of the low-order three bits of the 
effective address in Little-Endian mode is done as 
follows, for access to an individual aligned scalar. 
(Alignment is as determined before this modification.) 
Access to an individual unaligned scalar or to multiple 
scalars is described in subsequent sections, as is 
access to certain architecturally defined data in 
storage, data in caches (see Part 2, “PowerPC Virtual 
Environment Architecture” on page 117, and Part 3, 
“PowerPC Operating Environment Architecture” on 
page 141), etc. 


In Little-Endian mode, the effective address is com¬ 
puted in the same way as in Big-Endian mode. Then, 
in Little-Endian mode only, the low-order three bits of 
the effective address are exclusive-ored with a 
three-bit value that depends on the length of the 
operand (1, 2, 4, or 8 bytes), as shown in Table 11. 
This modified effective address is then passed to the 
storage subsystem, and data of the specified length 
are transferred to or from the addressed (as modified) 
storage location(s).


Data length (bytes)    EA modification

1                      XOR with 0b111
2                      XOR with 0b110
4                      XOR with 0b100
8                      (no change)


Table 11. PowerPC Little-Endian, effective address 
modification for individual aligned scalars 
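
The modification in Table 11 can be sketched in a few lines of Python. This is an illustrative model only, not part of the architecture; the names `XOR_MASK` and `modify_ea` are assumptions introduced here, and the sketch handles aligned scalars only.

```python
# Illustrative model of Table 11 (names are hypothetical, not architectural).
XOR_MASK = {1: 0b111, 2: 0b110, 4: 0b100, 8: 0b000}

def modify_ea(ea, length, little_endian=True):
    """Return the address presented to the storage subsystem
    for an aligned scalar of the given length in bytes."""
    if length not in XOR_MASK:
        raise ValueError("length must be 1, 2, 4, or 8 bytes")
    if ea % length != 0:
        raise ValueError("this sketch handles aligned scalars only")
    if not little_endian:
        return ea                     # Big-Endian mode: EA passes through unchanged
    return ea ^ XOR_MASK[length]      # only the low-order three bits can change

print(hex(modify_ea(0x10, 4)))  # word at EA 0x10
print(hex(modify_ea(0x14, 1)))  # byte at EA 0x14
```

Because the XOR value has at most three bits set, the modified address always stays within the doubleword that contains the original effective address.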


The effective address modification makes it appear to 
the processor that individual aligned scalars are 
stored Little-Endian, while in fact they are stored Big- 
Endian but in different bytes within doublewords from 
the order in which they are stored in Big-Endian 
mode. 

For example, in Little-Endian mode structure s would 
be placed in storage as follows, from the point of view 
of the storage subsystem (i.e., after the effective 
address modification described above). 


                         11   12   13   14
00   00   01   02   03   04   05   06   07

     21   22   23   24   25   26   27   28
08   08   09   0A   0B   0C   0D   0E   0F

     'D'  'C'  'B'  'A'  31   32   33   34
10   10   11   12   13   14   15   16   17

               51   52        'G'  'F'  'E'
18   18   19   1A   1B   1C   1D   1E   1F

                         61   62   63   64
20   20   21   22   23   24   25   26   27

Figure 75. PowerPC Little-Endian, structure 's' in 
storage subsystem 

Figure 75 is identical to Figure 74 on page 234 
except that the byte numbers within each doubleword 
are reversed. (This identity is in some sense an
artifact of depicting storage as a sequence of
doublewords. If storage is instead depicted as a
sequence of words, a single byte stream, etc., then no
such identity appears. However, regardless of the 
unit in which storage is depicted or accessed, the 
address of a given byte in Figure 75 differs from the 
address of the same byte in Figure 74 on page 234 
only in the low-order three bits, and the sum of the 
two 3-bit values that comprise the low-order three
bits of the two addresses is equal to 7. Depicting 
storage as a sequence of doublewords makes this 
relationship easy to see.) 
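
As a rough check of the layout in Figure 75, the following Python sketch stores the fields of structure s the way a Little-Endian mode program would: each aligned scalar has its effective address modified and is then written in Big-Endian byte order. The field offsets and values are read off Figure 75; the helper name `store_scalar` and the `bytearray` standing in for storage are assumptions made for illustration.

```python
# Toy model: place structure s in storage as seen by the storage subsystem.
XOR_MASK = {1: 0b111, 2: 0b110, 4: 0b100, 8: 0b000}

def store_scalar(storage, ea, value, length):
    """Store an aligned scalar: modify the EA, then write Big-Endian bytes."""
    addr = ea ^ XOR_MASK[length]
    storage[addr:addr + length] = value.to_bytes(length, "big")

storage = bytearray(0x28)
store_scalar(storage, 0x00, 0x11121314, 4)          # word
store_scalar(storage, 0x08, 0x2122232425262728, 8)  # doubleword
store_scalar(storage, 0x10, 0x31323334, 4)          # word
for i, ch in enumerate(b"ABCDEFG"):                 # seven individual bytes
    store_scalar(storage, 0x14 + i, ch, 1)
store_scalar(storage, 0x1C, 0x5152, 2)              # halfword
store_scalar(storage, 0x20, 0x61626364, 4)          # word

for base in range(0, len(storage), 8):
    print(f"{base:02X}:", storage[base:base + 8].hex(" "))

# Byte X of the processor's Little-Endian view is byte X ^ 7 in storage,
# so the low-order three bits of the two addresses always sum to 7.
le_view = bytes(storage[x ^ 7] for x in range(len(storage)))
```

Reading `le_view` in Little-Endian order recovers the original field values, which is the sense in which the processor sees the data as Little-Endian.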

Because of the modification performed on effective 
addresses, structure s appears to the processor to be 
mapped into storage as follows when the processor is 
in Little-Endian mode. 

Figure 76. PowerPC Little-Endian, structure 's' as
seen by processor 

Notice that, as seen by the program executing in the 
processor, the mapping for structure s is identical to 
the Little-Endian mapping shown in Figure 74. From a 
point of view outside the processor, however, the 
addresses of the bytes making up structure s are as 
shown in Figure 75. These addresses match neither 
the Big-Endian mapping of Figure 73 nor the Little- 
Endian mapping of Figure 74; allowance must be 
made for this in certain circumstances (e.g., when 
performing I/O: see Section D.7). 

The following four sections describe in greater detail 
the effects of running in Little-Endian mode on 
accessing data storage, on fetching instructions, on 
explicitly accessing the caches, the Segment 
Lookaside Buffer, and the Translation Lookaside 
Buffer (see Part 2, “PowerPC Virtual Environment 
Architecture” on page 117, and Part 3, “PowerPC 
Operating Environment Architecture” on page 141), 
and on doing I/O. 

D.4 PowerPC Data Storage
Addressing in Little-Endian Mode 


D.4.1 Individual Aligned Scalars 

When the storage operand is aligned for any instruc¬ 
tion in the following classes, the effective address 
presented to the storage subsystem is computed as 
described in Section D.3.2: Fixed-Point Load (Section 
3.3.2), Fixed-Point Store (Section 3.3.3), Load and 
Store with Byte Reversal, Storage Synchronization 
(excluding sync), Floating-Point Load, and Floating- 
Point Store (including stfiwx). 

The Load and Store with Byte Reversal instructions 
have the effect of loading or storing data in the oppo¬ 
site Endian mode from that in which the processor is 
running. That is, data are loaded or stored in Little- 
Endian order if the processor is running in Big-Endian 
mode, and in Big-Endian order if the processor is 
running in Little-Endian mode. 


D.4.2 Other Scalars 

As described below, the system alignment error
handler may be (Section D.4.2.1, “Individual Unaligned
Scalars” on page 237) or is (Section D.4.2.2,
“Multiple Scalars” on page 237) invoked if an attempt is
made in Little-Endian mode to execute any of the
instructions described in the following two subsections.

- Programming Note - 

It is up to system software whether the system 
alignment error handler, when invoked because of
an attempt to execute any of the instructions
described in this section when the processor is in 
Little-Endian mode, should emulate the instruction 
and resume the program that made the attempt, 
or should treat the instruction as illegal and termi¬ 
nate the program. 

Little-Endian mode programs on PowerPC are of 
necessity new (not old Power binaries). It is prob¬ 
ably best for the compiler not to generate these 
instructions in Little-Endian mode, since emulation 
would be slower than using a series of aligned 
Load or Store instructions, either in-line or in a 
subroutine. An exception is the case of accessing 
an individual scalar (see Section D.4.2.1) when the 
alignment is not known by the compiler but the 
operand is expected usually to be aligned: in this 
case it may be better for the compiler to generate 
the individual Load or Store instruction, and let 
the system alignment error handler be invoked 
and emulate the instruction if the operand is in 
fact unaligned. 

D.4.2.1 Individual Unaligned Scalars

The “trick” of exclusive-oring the low order three bits 
of the effective address of an individual scalar does 
not work unless the scalar is aligned. In Little-Endian 
mode, PowerPC processors may cause the system 
alignment error handler to be invoked whenever any 
of the Load or Store instructions listed in Section 
D.4.1 is issued with an unaligned effective address, 
regardless of whether such an access could be 
handled without invoking the system alignment error 
handler in Big-Endian mode. 

PowerPC processors are not required to invoke the 
system alignment error handler when an unaligned 
access is attempted in Little-Endian mode. The imple¬ 
mentation may handle some or all such accesses 
without invoking the system alignment error handler, 
just as in Big-Endian mode. The architectural require¬ 
ment is that halfwords, words, and doublewords be 
placed in storage such that the Little-Endian effective 
address of the lowest-order byte is the effective 
address computed by the Load or Store instruction, 
the Little-Endian address of the next-lowest-order byte 
is one greater, and so on. (lwarx, ldarx, stwcx., and
stdcx. differ somewhat from the rest of the 
instructions listed in Section D.4.1, in that neither the 
implementation nor the system alignment error 
handler is expected to handle these four instructions 
“correctly” if their operands are not aligned.) 

Figure 77 shows an example of a word w stored at 
Little-Endian address 5. The word is assumed to 
contain the binary value 0x11121314. 

Figure 77. Little-Endian mapping of word 'w' stored
at address 5 

In Little-Endian mode word w would be placed in 
storage as follows, from the point of view of the 
storage subsystem (i.e., after the effective address 
modification described in Section D.3.2, “PowerPC 
Little-Endian Byte Ordering” on page 235). 

Figure 78. PowerPC Little-Endian, word 'w' stored at
address 5, in storage subsystem 

Notice that the unaligned word w in Figure 78 spans
two doublewords. The two parts of the unaligned
word are not contiguous as seen by the storage subsystem.

An implementation may choose to support some but 
not all unaligned Little-Endian accesses. For example, 
an unaligned Little-Endian access that is contained 
within a single doubleword may be supported, while 
one that spans doublewords may cause the system 
alignment error handler to be invoked. 
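
The placement of the unaligned word w can be worked through with a short Python sketch. It applies the byte-level relationship stated in Section D.3.2 (the storage address of a byte differs from its Little-Endian address only in the low-order three bits, which are complemented). The function names are hypothetical and the model says nothing about which accesses a given implementation supports.

```python
def storage_addresses(le_ea, length):
    """Storage-subsystem address of each byte of a scalar,
    lowest-order byte first."""
    return [(le_ea + i) ^ 0b111 for i in range(length)]

def spans_doublewords(le_ea, length):
    """True if the scalar crosses a doubleword boundary."""
    return (le_ea // 8) != ((le_ea + length - 1) // 8)

value = 0x11121314
addrs = storage_addresses(5, 4)          # word w at Little-Endian address 5
for i, addr in enumerate(addrs):
    byte = (value >> (8 * i)) & 0xFF     # lowest-order byte first
    print(f"byte 0x{byte:02X} -> storage address 0x{addr:02X}")

print("spans doublewords:", spans_doublewords(5, 4))
```

Three bytes of w land at storage addresses 0x00 through 0x02 and the high-order byte lands at 0x0F, which is why the two parts are not contiguous from the storage subsystem's point of view.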

D.4.2.2 Multiple Scalars 

PowerPC has two classes of instructions that handle 
multiple scalars, namely the Load and Store Multiple 
instructions and the Move Assist instructions. 
Because both classes of instructions potentially deal 
with more than one word-length scalar, neither class 
is amenable to the effective address modification 
described in Section D.3.2 (e.g., pairs of aligned words 
would be accessed in reverse order from what the 
program would expect). Attempting to execute any of 
these instructions in Little-Endian mode causes the 
system alignment error handler to be invoked. 

D.4.3 Segment Tables and Page 
Tables 

The layout of Segment Tables and Page Tables in 
storage (see Part 3, “PowerPC Operating Environment 
Architecture” on page 141) is independent of Endian 
mode. A given byte in one of these tables must be 
accessed using an effective address appropriate to 
the mode of the executing program (e.g., the high-order
byte of a Page Table entry must be accessed
with an effective address ending with 0b000 in Big-Endian
mode, and with an effective address ending
with 0b111 in Little-Endian mode).

- Engineering Note - 

An implementation that uses software assistance 
to facilitate the hardware's searching and alter¬ 
ation of Segment Tables and/or Page Tables must 
supply two separate software routines, one for 
Big-Endian mode and one for Little-Endian mode. 


D.5 PowerPC Instruction 
Storage Addressing in 
Little-Endian Mode 

Each PowerPC instruction occupies an aligned word in 
storage. The processor fetches and executes 
instructions as if the CIA were advanced by four for 
each sequentially fetched instruction. When the 
processor is in Little-Endian mode, the effective 
address presented to the storage subsystem to fetch 
an instruction is the value from the CIA, modified as
described in Section D.3.2 for aligned word-length 
scalars. A Little-Endian program is thus an array of 
aligned Little-Endian words, with each word fetched 
and executed in order (discounting branches and 
invocations of the system error handler). 

Figure 79 shows an example of a small assembly lan¬ 
guage program p. 


loop: cmplwi  r5,0
      beq     done
      lwzux   r4,r5,r6
      add     r7,r7,r4
      subi    r5,r5,4
      b       loop
done: stw     r7,total


Figure 79. Assembly language program 'p' 

The Big-Endian mapping for program p is shown in 
Figure 80 (assuming the program starts at address 0). 

Figure 80. Big-Endian mapping of program 'p'

The same program p is shown mapped Little-Endian 
in Figure 81. 


        beq done          loop: cmplwi r5,0     00
   07   06   05   04      03   02   01   00

      add r7,r7,r4         lwzux r4,r5,r6       08
   0F   0E   0D   0C      0B   0A   09   08

         b loop             subi r5,r5,4        10
   17   16   15   14      13   12   11   10

                         done: stw r7,total     18
   1F   1E   1D   1C      1B   1A   19   18

Figure 81. Little-Endian mapping of program 'p' 

In Little-Endian mode program p would be placed in 
storage as follows, from the point of view of the 
storage subsystem (i.e., after the effective address 
modification described in Section D.3.2). 

Figure 82. PowerPC Little-Endian, program 'p' in
storage subsystem 

Figure 82 is identical to Figure 81 except that the 
byte numbers within each doubleword are reversed. 
(This identity is in some sense an artifact of depicting 
storage as a sequence of doublewords. If storage is 
instead depicted as a sequence of words, a single 
byte stream, etc., then no such identity appears. 
However, regardless of the unit in which storage is 
depicted or accessed, the address of a given byte in 
Figure 82 differs from the address of the same byte 
in Figure 81 only in the low-order three bits, and the 
sum of the two 3-bit values that comprise the low- 
order three bits of the two addresses is equal to 7. 
Depicting storage as a sequence of doublewords 
makes this relationship easy to see.) 

Each individual machine instruction appears in 
storage as a 32-bit integer containing the value 
described in the instruction description, regardless of 
the Endian mode. This is a consequence of the fact 
that individual aligned scalars are mapped in storage 
in Big-Endian byte order. 

Notice that, as seen by the processor when executing 
program p, the mapping for program p is identical to 
the Little-Endian mapping shown in Figure 81. From a 
point of view outside the processor, however, the 
addresses of the bytes making up program p are as 
shown in Figure 82. These addresses match neither 
the Big-Endian mapping of Figure 80 nor the Little- 
Endian mapping of Figure 81. 
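
The placement of program p in the storage subsystem follows directly from treating each instruction as an aligned word. The Python sketch below is an illustration under that assumption: it maps each CIA to the address at which the instruction word is fetched. The name `fetch_address` is hypothetical, and the instruction text is carried as strings rather than encoded.

```python
program = [
    "cmplwi r5,0",     # loop:
    "beq done",
    "lwzux r4,r5,r6",
    "add r7,r7,r4",
    "subi r5,r5,4",
    "b loop",
    "stw r7,total",    # done:
]

def fetch_address(cia):
    """Address presented to the storage subsystem for an instruction
    fetch in Little-Endian mode (aligned word: XOR with 0b100)."""
    return cia ^ 0b100

# Program assumed to start at address 0, one word per instruction.
placement = {fetch_address(4 * i): insn for i, insn in enumerate(program)}
for addr in sorted(placement):
    print(f"0x{addr:02X}: {placement[addr]}")
```

Sorted by storage address, the two instructions in each doubleword appear in the opposite order from the one in which they execute.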

All instruction effective addresses visible to an executing
program are the effective addresses that are
computed by that program or, in the case of the 
system error handler, effective addresses that were 
or could have been computed by the interrupted 
program. These effective addresses are independent 
of Endian mode. Examples for Little-Endian mode 
include the following. 

■ An instruction address placed in the Link Register 
by a Branch instruction with LK = 1, or an instruc¬ 
tion address saved in a System Register when 
the system error handler is invoked, is the effec¬ 
tive address that a program executing in Little- 
Endian mode would use to access the instruction 
as a data word using a Load instruction. 

■ An offset in a relative Branch instruction (Branch 
or Branch Conditional with AA = 0) reflects the 
difference between the addresses of the branch
and target instructions, using the addresses that 
a program executing in Little-Endian mode would 
use to access the instructions as data words 
using Load instructions. 

■ A target address in an absolute Branch instruction
(Branch or Branch Conditional with AA = 1) is
the address that a program executing in Little- 
Endian mode would use to access the target 
instruction as a data word using a Load instruc¬ 
tion. 

■ The storage locations that contain the first set of 
instructions executed by each kind of system 
error handler must be set in a manner consistent 
with the Endian mode in which the system error 
handler will be invoked. (These sets of 
instructions occupy architecturally defined 
locations: see Part 3, “PowerPC Operating Envi¬ 
ronment Architecture” on page 141.) Thus if the 
system error handler is to be invoked in Little- 
Endian mode, the first set of instructions com¬ 
prising each kind of system error handler must 
appear in storage, from the point of view of the 
storage subsystem (i.e., after the effective 
address modification described in Section D.3.2), 
with the pairs of instructions within each 
doubleword reversed from the order in which 
they are to be executed. (If the instructions are 
placed into storage by a program running in the 
same Endian mode as that in which the system 
error handler will be invoked, the appropriate
order will be achieved naturally.) 


D.6 PowerPC Cache 
Management and Lookaside 
Buffer Management Instructions 
in Little-Endian Mode 

The instructions for explicitly accessing the caches, 
Segment Lookaside Buffer, and Translation Lookaside 
Buffer (see Part 2, “PowerPC Virtual Environment 
Architecture” on page 117, and Part 3, “PowerPC 
Operating Environment Architecture” on page 141) 
are unaffected by Endian mode. (Identification of the
block, Segment Table Entry, or Page Table Entry to be
accessed is not affected by the low-order three bits of
the effective address.)

D.7 PowerPC I/O in Little-Endian
Mode 

Input/output (I/O), such as writing the contents of a 
large area of storage to disk, transfers a byte stream 
on both Big-Endian and Little-Endian systems. For the 
disk transfer, the first byte of the area is written to 
the first byte of the disk record and so on. 

For a PowerPC system running in Big-Endian mode, 
I/O transfers happen “naturally” because the byte 
that the processor sees as byte 0 is the same one 
that the storage subsystem sees as byte 0. 

For a PowerPC system running in Little-Endian mode 
this is not the case, because of the modification of the 
low-order three bits of the effective address when the 
processor accesses storage. In order for I/O transfers 
to transfer byte streams properly, in Little-Endian 
mode I/O transfers must be performed as if the bytes 
transferred were accessed one byte at a time, using 
the address modification described in Section D.3.2 
for single-byte scalars. This does not mean that I/O 
on Little-Endian PowerPC systems must use only 
1-byte-wide transfers; data transfers can be as wide 
as desired, but the order of the bytes transferred 
within doublewords must appear as if the bytes were 
fetched or stored one byte at a time. See the System 
Architecture documentation for a given PowerPC 
system for details on the transfer width and byte 
ordering on that system. 
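
The byte-ordering requirement for such transfers can be illustrated with a small Python sketch. It assumes a buffer that starts on a doubleword boundary and is a whole number of doublewords long; the function names are hypothetical, and real transfer widths are system-specific as noted above.

```python
def to_storage_image(byte_stream):
    """Arrange a byte stream in storage so that a Little-Endian mode
    program reading one byte at a time (EA XOR 0b111) sees stream order."""
    if len(byte_stream) % 8 != 0:
        raise ValueError("sketch assumes whole doublewords")
    image = bytearray(len(byte_stream))
    for ea, b in enumerate(byte_stream):
        image[ea ^ 0b111] = b
    return bytes(image)

def read_byte(image, ea):
    """Model of a single-byte load in Little-Endian mode."""
    return image[ea ^ 0b111]

stream = b"ABCDEFGHIJKLMNOP"
image = to_storage_image(stream)
print(image)                                       # bytes reversed within each doubleword
print(bytes(read_byte(image, i) for i in range(len(stream))))
```

The transfer may move a doubleword or more at a time, provided the resulting storage image matches what the byte-at-a-time model produces.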

However, not all I/O done on PowerPC systems is for 
large areas of storage as described above. I/O can 
be performed with certain devices merely by storing 
to or loading from addresses that are associated with 
the devices (the terms “memory-mapped I/O” and 
“programmed I/O” or “PIO” are used for this). For 
such PIO transfers, care must be taken when defining 
the addresses to be used, for these addresses are 
subject to the effective address modification shown in 
Table 11 on page 235. A Load or Store instruction 
that maps to a control register on a device may 
require that the value loaded or stored have its bytes 
reversed; if this is required, the Load and Store with 
Byte Reversal instructions can be used. Any require¬ 
ment for such byte reversal for a particular I/O device 
register is independent of whether the PowerPC 
system is running in Big-Endian or Little-Endian mode. 

Similarly, the address sent to an I/O device by an 
eciwx or ecowx instruction (see Part 3, “PowerPC 
Operating Environment Architecture” on page 141) is 
subject to the effective address modification shown in 
Table 11 on page 235. 


- Programming Note -

In general, a given copy of a subroutine in 
storage cannot be shared between programs 
running in different Endian modes. This affects 
the sharing of subroutine libraries. (It is possible, 
in principle, to write a subroutine that could be 
thus shared — e.g., let every second instruction 
be a no-op — but such a subroutine is not likely to 
be useful in practice.) 

D.8 Origin of Endian


The terms Big-Endian and Little-Endian come from 
Part I, Chapter 4, of Jonathan Swift's Gulliver's 
Travels. Here is the complete passage, from the 
edition printed in 1734 by George Faulkner in Dublin. 


... our Histories of six Thousand Moons make 
no Mention of any other Regions, than the 
two great Empires of Lilliput and Blefuscu. 
Which two mighty Powers have, as I was 
going to tell you, been engaged in a most 
obstinate War for six and thirty Moons past. 
It began upon the following Occasion. It is 
allowed on all Hands, that the primitive Way 
of breaking Eggs before we eat them, was 
upon the larger End: But his present Majes¬ 
ty's Grand-father, while he was a Boy, going 
to eat an Egg, and breaking it according to 
the ancient Practice, happened to cut one of 
his Fingers. Whereupon the Emperor his 
Father, published an Edict, commanding all 
his Subjects, upon great Penalties, to break 
the smaller End of their Eggs. The People so 
highly resented this Law, that our Histories 
tell us, there have been six Rebellions raised 
on that Account; wherein one Emperor lost 
his Life, and another his Crown. These civil 
Commotions were constantly fomented by the 
Monarchs of Blefuscu ; and when they were 
quelled, the Exiles always fled for Refuge to 
that Empire. It is computed that eleven 
Thousand Persons have, at several Times, 
suffered Death, rather than submit to break 
their Eggs at the smaller End. Many hundred 
large Volumes have been published upon this
Controversy: But the Books of the Big-
Endians have been long forbidden, and the
whole Party rendered incapable by Law of 
holding Employments. During the Course of 
these Troubles, the Emperors of Blefuscu did 
frequently expostulate by their Ambassadors, 
accusing us of making a Schism in Religion, 
by offending against a fundamental Doctrine 
of our great Prophet Lustrog, in the fifty- 
fourth Chapter of the Brundrecal, (which is 
their Alcoran.) This, however, is thought to 
be a mere Strain upon the text: For the 
Words are these; That all true Believers shall 
break their Eggs at the convenient End: and 
which is the convenient End, seems, in my 
humble Opinion, to be left to every Man's 
Conscience, or at least in the Power of the 
chief Magistrate to determine. Now the Big- 
Endian Exiles have found so much Credit in 
the Emperor of Blefuscu's Court; and so 
much private Assistance and Encouragement 
from their Party here at home, that a bloody 
War has been carried on between the two 
Empires for six and thirty Moons with various 
Success; during which Time we have lost 
Forty Capital Ships, and a much greater 
Number of smaller Vessels, together with 
thirty thousand of our best Seamen and Sol¬ 
diers; and the Damage received by the 
Enemy is reckoned to be somewhat greater 
than ours. However, they have now 
equipped a numerous Fleet, and are just pre¬ 
paring to make a Descent upon us: and his 
Imperial Majesty, placing great Confidence in 
your Valour and Strength, hath commanded 
me to lay this Account of his Affairs before 
you. 

Appendix E. Programming Examples


E.1 Synchronization 


This appendix gives examples of how the Storage 
Synchronization instructions can be used to emulate 
various synchronization primitives, and to provide 
more complex forms of synchronization. 

These examples have a common form. After possible 
initialization, there is a “conditional sequence” that:
begins with a Load And Reserve instruction; may be
followed by memory accesses and/or computation
that include neither a Load And Reserve nor a Store
Conditional; and ends with a Store Conditional
instruction with the same target address as the initial 
Load And Reserve. In most of the examples, failure 
of the Store Conditional causes a branch back to the 
Load And Reserve for a repeated attempt. In the 
examples, on the assumption that contention is low, 
the conditional branch is optimized for the case in 
which the stwcx. succeeds, by setting the branch- 
prediction bit appropriately. This is done by 
appending a minus sign to the instruction mnemonic, 
as described in Appendix C.1.4, “Branch prediction” 
on page 224. These examples focus on techniques 
for the correct modification of shared storage 
locations: see Note 4 in Section E.1.4 for a discussion 
of how the retry strategy can affect performance. 

The Load And Reserve and Store Conditional 
instructions depend on the coherence mechanism of 
the system. Stores to a given location are coherent if 
they are serialized in some order, and no processor is 
able to observe a subset of those stores as occurring 
in a conflicting order. See Part 2, “PowerPC Virtual
Environment Architecture” on page 117, for additional
details. 

Each load operation, whether ordinary or Load And 
Reserve, returns a value that has a well-defined 
source. The source can be the Store or Store Condi¬ 
tional instruction that wrote the value, an operation 
by some other mechanism that accesses storage 
(e.g., an I/O device), or the initial state of storage. 


The function of an atomic read/modify/write operation 
is to read a location and write its next value, possibly 
as a function of its current value, all as a single 
atomic operation. We assume that locations accessed 
by read/modify/write operations are accessed 
coherently, so the concept of a value being the next 
in the sequence of values for a location is well 
defined. The conditional sequence, as defined above, 
provides the effect of an atomic read/modify/write 
operation, but not with a single atomic instruction. 
Let addr be the location that is the common target of 
the Load And Reserve and Store Conditional 
instructions. Then the guarantee the architecture 
makes for the successful execution of the conditional 
sequence is that no store into addr by another 
processor or mechanism intervened between the 
source of the Load And Reserve and the Store Condi¬ 
tional. 
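
The guarantee can be made concrete with a toy Python model. This is a single-threaded sketch under simplifying assumptions, not a description of any implementation: the class `ToyStorage`, its method names, and the `interfere` hook are all hypothetical, the model tracks one reservation at word granularity, and its ordinary `store` stands in for a store by another processor or mechanism.

```python
class ToyStorage:
    """Single-reservation model of Load And Reserve / Store Conditional."""

    def __init__(self):
        self.words = {}
        self.reservation = None            # address currently reserved, if any

    def load_and_reserve(self, addr):      # models lwarx
        self.reservation = addr
        return self.words.get(addr, 0)

    def store(self, addr, value):          # models a store by another processor
        self.words[addr] = value
        if self.reservation == addr:
            self.reservation = None        # intervening store cancels the reservation

    def store_conditional(self, addr, value):   # models stwcx.
        ok = self.reservation == addr
        if ok:
            self.words[addr] = value
        self.reservation = None
        return ok

def fetch_and_add(mem, addr, increment, interfere=None):
    """Conditional sequence with retry. If given, interfere is called once
    between the load and the store to simulate another processor."""
    while True:
        old = mem.load_and_reserve(addr)
        if interfere is not None:
            interfere()
            interfere = None
        if mem.store_conditional(addr, (old + increment) & 0xFFFFFFFF):
            return old

mem = ToyStorage()
mem.store(0x100, 5)
old = fetch_and_add(mem, 0x100, 3, interfere=lambda: mem.store(0x100, 10))
print(old, mem.words[0x100])
```

In this run the first attempt fails because of the intervening store, and the retry observes the newer value, so the increment is applied to the value that was current when the Store Conditional succeeded.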

For each of these examples, it is assumed that a 
similar sequence of instructions is used by all proc¬ 
esses requiring synchronization on the accessed data. 

The examples deal with words: they can be used for
doublewords by changing all lwarx instructions to
ldarx, all stwcx. instructions to stdcx., all stw
instructions to std, and all cmpw[i] extended mnemonics
to cmpd[i].

- Programming Note - 

Because the Storage Synchronization instructions 
have implementation dependencies (e.g., the 
granularity at which reservations are managed), 
they must be used with care. The operating 
system should provide system library programs 
that use these instructions to implement the high- 
level synchronization functions (Test and Set, 
Compare and Swap, etc.) needed by application 
programs. Application programs should use these 
library programs, rather than use the Storage 
Synchronization instructions directly. 







E.1.1 Synchronization Primitives 

The following examples show how the lwarx and
stwcx. instructions can be used to emulate various
synchronization primitives.

The sequences used to emulate the various primitives
consist primarily of a loop using lwarx and stwcx.. No
additional synchronization is necessary, because the
stwcx. will fail, setting the EQ bit to 0, if the word
loaded by lwarx has changed before the stwcx. is
executed: see Part 2, “PowerPC Virtual Environment
Architecture” on page 117 for more detail.


Fetch and No-op 

The “Fetch and No-op” primitive atomically loads the 
current value in a word in storage. 

In this example it is assumed that the address of the 
word to be loaded is in GPR 3 and the data loaded 
are returned in GPR 4. 

loop: lwarx   r4,0,r3    #load and reserve
      stwcx.  r4,0,r3    #store old value if
                         # still reserved
      bne-    loop       #loop if lost reserv'n

Notes:

1. The stwcx., if it succeeds, stores to the target
   location the same value that was loaded by the
   preceding lwarx. While the store is redundant
   with respect to the value in the location, its
   success ensures that the value loaded by the
   lwarx was the current value, i.e., that the source
   of the value loaded by the lwarx was the last
   store to the location that preceded the stwcx. in
   the coherence order for the location.


Fetch and Store 

The “Fetch and Store” primitive atomically loads and 
replaces a word in storage. 

In this example it is assumed that the address of the 
word to be loaded and replaced is in GPR 3, the new 
value is in GPR 4, and the old value is returned in 
GPR 5. 

loop: lwarx   r5,0,r3    #load and reserve
      stwcx.  r4,0,r3    #store new value if
                         # still reserved
      bne-    loop       #loop if lost reserv'n


Fetch and Add

The “Fetch and Add” primitive atomically increments
a word in storage.

In this example it is assumed that the address of the
word to be incremented is in GPR 3, the increment is
in GPR 4, and the old value is returned in GPR 5.

loop: lwarx   r5,0,r3    #load and reserve
      add     r0,r4,r5   #increment word
      stwcx.  r0,0,r3    #store new value if
                         # still reserved
      bne-    loop       #loop if lost reserv'n


Fetch and AND

The “Fetch and AND” primitive atomically ANDs a
value into a word in storage.

In this example it is assumed that the address of the
word to be ANDed is in GPR 3, the value to AND into
it is in GPR 4, and the old value is returned in GPR 5.

loop: lwarx   r5,0,r3    #load and reserve
      and     r0,r4,r5   #AND word
      stwcx.  r0,0,r3    #store new value if
                         # still reserved
      bne-    loop       #loop if lost reserv'n

Notes:

1. The sequence given above can be changed to
   perform another Boolean operation atomically on
   a word in storage, simply by changing the and
   instruction to the desired Boolean instruction (or,
   xor, etc.).


Test and Set

The “Test and Set” primitive atomically loads a word
from storage, ensures that the word in storage contains
a non-zero value, and sets the EQ bit of CR Field
0 according to whether the value loaded is zero.

In this example it is assumed that the address of the
word to be tested is in GPR 3, the new value (non-zero)
is in GPR 4, and the old value is returned in
GPR 5.

loop: lwarx   r5,0,r3    #load and reserve
      cmpwi   r5,0       #done if word
      bne-    $+12       # not equal to 0
      stwcx.  r4,0,r3    #try to store non-0
      bne-    loop       #loop if lost reserv'n

Notes:

1. Depending on the application, if Test and Set fails
   (i.e., sets the EQ bit of CR Field 0 to zero) it may
   be appropriate to re-execute the Test and Set.


Compare and Swap

The “Compare and Swap” primitive atomically compares
a value in a register with a word in storage, if
they are equal stores the value from a second register
into the word in storage, if they are unequal
loads the word from storage into the first register,
and sets the EQ bit of CR Field 0 to indicate the result
of the comparison.

In this example it is assumed that the address of the
word to be tested is in GPR 3, the comparand is in
GPR 4 and the old value is returned there, and the
new value is in GPR 5.

loop: lwarx   r6,0,r3    #load and reserve
      cmpw    r4,r6      #1st 2 operands equal?
      bne-    exit       #skip if not
      stwcx.  r5,0,r3    #store new value if
                         # still reserved
      bne-    loop       #loop if lost reserv'n
exit: mr      r4,r6      #return value from storage

Notes:

1. The semantics given for “Compare and Swap”
   above are based on those of the IBM System/370
   Compare and Swap instruction. Other architectures
   may define a Compare and Swap instruction
   differently.

2. “Compare and Swap” is shown primarily for pedagogical
   reasons. It is useful on machines that
   lack the better synchronization facilities provided
   by lwarx and stwcx.. A major weakness of a
   System/370-style Compare and Swap instruction
   is that, although the instruction itself is atomic, it
   checks only that the old and current values of the
   word being tested are equal, with the result that
   programs that use such a Compare and Swap to
   control a shared resource can err if the word has
   been modified and the old value subsequently
   restored. The sequence shown above has the
   same weakness.

3. In some applications the second bne- instruction
   and/or the mr instruction can be omitted. The
   bne- is needed only if the application requires
   that if the EQ bit of CR Field 0 on exit indicates
   “not equal” then (r4) and (r6) are in fact not
   equal. The mr is needed only if the application
   requires that if the comparands are not equal
   then the word from storage is loaded into the register
   with which it was compared (rather than into
   a third register). If either or both of these
   instructions is omitted, the resulting Compare and
   Swap does not obey System/370 semantics.

4. Depending on the application, if Compare and
   Swap fails (i.e., sets the EQ bit of CR Field 0 to
   zero) it may be appropriate to recompute the
   value potentially to be stored and then reexecute
   the Compare and Swap.


E.1.2 Lock Acquisition and Release


This example gives an algorithm for locking that dem¬ 
onstrates the use of synchronization with an atomic 
read/modify/write operation. A shared storage 
location, the address of which is an argument of the 
“lock” and “unlock” procedures, given by GPR 3, is 
used as a lock, to control access to some shared 
resource such as a shared data structure. The lock is 
open when its value is 0, and closed (locked) when its 
value is 1. Before accessing the shared resource, a 
processor sets the lock, by changing its value from 0 
to 1. To do this, the “lock” procedure calls 
testjandjset, which executes the code sequence 
shown in the “Test and Set” example of Section E.1.1, 
thereby atomically loading the old value of the lock, 
writing to the lock the new value (1) given in GPR 4, 
returning the old value in GPR 5 (not used below), and 
setting the EO bit of CR Field 0 according to whether 
the value loaded is zero. The “lock” procedure 
repeats the test_and_set until it succeeds in changing 
the value of the lock from 0 to 1. 

The processor must not access the shared resource 
until it sets the lock. After the bne- that checks for 
the success of test_and_set, the processor executes 
an isync instruction (see Part 2, “PowerPC Virtual 
Environment Architecture” on page 117). This delays 
all subsequent instructions until all previous 
instructions have completed to the extent required by 
context synchronization (see Part 3, “PowerPC Operating 
Environment Architecture” on page 141). sync 
could be used, but performance would be degraded 
unnecessarily because sync waits for all prior storage 
accesses to complete with respect to all other 
processors, which is not necessary here. 

lock: li    r4,1           #obtain lock: 
loop: bl    test_and_set   # test-and-set 
      bne-  loop           # retry til old = 0 
# Delay subsequent inst'ns til prior inst'ns finish 
      isync 
      blr                  #return 

The “unlock” procedure writes a 0 to the lock 
location. Most applications that use locking require, 
for correctness, that if the access to the shared 
resource included write operations, the processor 
must execute a sync instruction to make its 
modifications visible to all processors before releasing the 
lock. In this example, the “unlock” procedure begins 
with a sync for this purpose. 

unlock: sync              #delay til prior stores finish 
        li   r1,0         #store zero to lock location 
        stw  r1,0(r3) 
        blr               #return 
E.1.3 List Insertion 

This example shows how the lwarx and stwcx. 
instructions can be used to implement simple 
insertion into a singly linked list. (Complicated list 
insertion, in which multiple values must be changed 
atomically, or in which the correct order of insertion 
depends on the contents of the elements, cannot be 
implemented in the manner shown below, and 
requires a more complicated strategy such as using 
locks.) 

The “next element pointer” from the list element after 
which the new element is to be inserted, here called 
the “parent element,” is stored into the new element, 
so that the new element points to the next element in 
the list: this store is performed unconditionally. Then 
the address of the new element is conditionally stored 
into the parent element, thereby adding the new 
element to the list. 

In this example it is assumed that the address of the 
parent element is in GPR 3, the address of the new 
element is in GPR 4, and the next element pointer is 
at offset 0 from the start of the element. It is also 
assumed that the next element pointer of each list 
element is in a “reservation granule” separate from 
that of the next element pointer of all other list 
elements: see Part 2, “PowerPC Virtual Environment 
Architecture” on page 117. 

loop: lwarx  r2,0,r3    #get next pointer 
      stw    r2,0(r4)   #store in new element 
      sync              #let store settle (can 
                        # omit if not MP) 
      stwcx. r4,0,r3    #add new element to list 
      bne-   loop       #loop if stwcx. failed 

In the preceding example, if two list elements have 
next element pointers in the same reservation 
granule then, in a multiprocessor, “livelock” can 
occur. (Livelock is a state in which processors 
interact in a way such that no processor makes 
progress.) 

If it is not possible to allocate list elements such that 
each element's next element pointer is in a different 
reservation granule, then livelock can be avoided by 
using the following, more complicated, sequence. 

       lwz    r2,0(r3)   #get next pointer 
loop1: mr     r5,r2      #keep a copy 
       stw    r2,0(r4)   #store in new element 
       sync              #let store settle 
loop2: lwarx  r2,0,r3    #get it again 
       cmpw   r2,r5      #loop if changed (someone 
       bne-   loop1      # else progressed) 
       stwcx. r4,0,r3    #add new element to list 
       bne-   loop2      #loop if failed 


E.1.4 Notes 

1. In general, lwarx and stwcx. instructions should 
be paired, with the same effective address used 
for both. The exception is an isolated stwcx. 
instruction that is used to clear any existing 
reservation on the processor, for which there is no 
paired lwarx and for which any (scratch) effective 
address can be used. 

2. It is acceptable to execute a lwarx instruction for 
which no stwcx. instruction is executed. For 
example, such a “dangling lwarx” occurs if the 
value loaded in the “Test and Set” sequence 
shown above is not zero. 

3. To increase the likelihood that forward progress 
is made, it is important that looping on 
lwarx/stwcx. pairs be minimized. For example, in 
the sequence shown above for “Test and Set,” 
this is achieved by testing the old value before 
attempting the store: were the order reversed, 
more stwcx. instructions might be executed, and 
reservations might more often be lost between 
the lwarx and the stwcx. instructions. 


4. The manner in which lwarx and stwcx. are 
communicated to other processors and mechanisms, 
and between levels of the storage subsystem 
within a given processor (see Part 2, “PowerPC 
Virtual Environment Architecture” on page 117), 
is implementation-dependent. In some 
implementations performance may be improved by 
minimizing looping on a lwarx instruction that fails to 
return a desired value. For example, in the “Test 
and Set” example shown above, if the 
programmer wishes to stay in the loop until the word 
loaded is zero, he could change the “bne- $+12” 
to “bne- loop.” However, in some 
implementations better performance may be obtained by 
using an ordinary Load instruction to do the initial 
checking of the value, as follows. 

loop: lwz    r5,0(r3)   #load the word 
      cmpwi  r5,0       #loop back if word 
      bne-   loop       # not equal to 0 
      lwarx  r5,0,r3    #try again, reserving 
      cmpwi  r5,0       # (likely to succeed) 
      bne-   loop 
      stwcx. r4,0,r3    #try to store non-0 
      bne-   loop       #loop if lost reserv'n 


5. In a multiprocessor, livelock is possible if a loop 
containing a lwarx/stwcx. pair also contains an 
ordinary Store instruction for which any byte of 
the affected storage area is in the reservation 
granule of the reservation: see Part 2, “PowerPC 
Virtual Environment Architecture” on page 117. 
For example, the first code sequence shown in 
Section E.1.3, List Insertion, can cause livelock if 
two list elements have next element pointers in 
the same reservation granule. 
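
The reservation behavior these Notes rely on, and the simple list insertion of Section E.1.3, can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the class `Storage`, the 16-byte granule size, the integer processor ids, and the rule that only stores by other processors cancel a reservation. Real behavior is implementation-dependent, as Note 4 says.

```python
class Storage:
    """Toy shared storage with one reservation per processor."""

    def __init__(self, granule=16):       # granule size is an assumption
        self.mem = {}
        self.granule = granule
        self.resv = {}                     # cpu id -> reserved granule index

    def lwarx(self, cpu, addr):
        self.resv[cpu] = addr // self.granule
        return self.mem.get(addr, 0)

    def store(self, cpu, addr, value):
        """Ordinary store: cancels other processors' reservations on the granule."""
        self.mem[addr] = value
        g = addr // self.granule
        for other, held in list(self.resv.items()):
            if other != cpu and held == g:
                del self.resv[other]

    def stwcx(self, cpu, addr, value):
        held = self.resv.pop(cpu, None)    # stwcx. always clears the reservation
        if held != addr // self.granule:
            return False                   # reservation lost: no store performed
        self.store(cpu, addr, value)
        return True


def insert_after(st, cpu, parent, new):
    """Simple list insertion: next pointer at offset 0 of each element."""
    while True:
        nxt = st.lwarx(cpu, parent)        # get next pointer
        st.store(cpu, new, nxt)            # store in new element
        if st.stwcx(cpu, parent, new):     # add new element to list
            return
```

With elements at addresses 0x100, 0x200, and 0x300 (each in its own granule under the assumed size), `insert_after(st, 0, 0x100, 0x300)` links the new element between the parent and its former successor. If another processor stores into the parent's granule between the `lwarx` and the `stwcx`, the `stwcx` returns False and the loop retries.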
E.2 Multiple-Precision Shifts 


This appendix gives examples of how multiple- 
precision shifts can be programmed. 

A multiple-precision shift is initially defined to be a 
shift of an N-doubleword quantity (64-bit mode) or an 
N-word quantity (32-bit mode), where N>1. (This defi¬ 
nition is relaxed somewhat for 32-bit mode, below.) 
The quantity to be shifted is contained in N registers 
(in the low-order 32 bits in 32-bit mode). The shift 
amount is specified either by an immediate value in 
the instruction, or by bits 57:63 (64-bit mode) or 58:63 
(32-bit mode) of a register. 

The examples shown below distinguish between the 
cases N = 2 and N>2. If N = 2, the shift amount may be 
in the range 0 through 127 (64-bit mode) or 0 through 
63 (32-bit mode), which are the maximum ranges 
supported by the Shift instructions used. However if 
N>2, the shift amount must be in the range 0 through 
63 (64-bit mode) or 0 through 31 (32-bit mode), in 
order for the examples to yield the desired result. 
The specific instance shown for N>2 is N = 3: 
extending those code sequences to larger N is 
straightforward, as is reducing them to the case N = 2 
when the more stringent restriction on shift amount is 
met. For shifts with immediate shift amounts only the 
case N = 3 is shown, because the more stringent 
restriction on shift amount is always met. 

In the examples it is assumed that GPRs 2 and 3 (and 
4) contain the quantity to be shifted, and that the 
result is to be placed into the same registers, except 
for the immediate left shifts in 64-bit mode for which 
the result is placed into GPRs 3, 4, and 5. In all 
cases, for both input and result, the lowest-numbered 
register contains the highest-order part of the data 
and highest-numbered register contains the lowest- 
order part. In 32-bit mode, the high-order 32 bits of 
these registers are assumed not to be part of the 
quantity to be shifted nor of the result. For non- 
immediate shifts, the shift amount is assumed to be in 
bits 57:63 (64-bit mode) or 58:63 (32-bit mode) of GPR 
6. For immediate shifts, the shift amount is assumed 
to be greater than 0. GPRs 0 and 31 are used as 
scratch registers. 

For N>2, the number of instructions required is 2N-1 
(immediate shifts) or 3N-1 (non-immediate shifts). 


Multiple-precision shifts in 64-bit mode                    Multiple-precision shifts in 32-bit mode 

Shift Left Immediate, N = 3 (shift amnt < 64)               Shift Left Immediate, N = 3 (shift amnt < 32) 

rldicr  r5,r4,sh,63-sh                                      rlwinm  r2,r2,sh,0,31-sh 
rldimi  r4,r3,0,sh                                          rlwimi  r2,r3,sh,32-sh,31 
rldicl  r4,r4,sh,0                                          rlwinm  r3,r3,sh,0,31-sh 
rldimi  r3,r2,0,sh                                          rlwimi  r3,r4,sh,32-sh,31 
rldicl  r3,r3,sh,0                                          rlwinm  r4,r4,sh,0,31-sh 

Shift Left, N = 2 (shift amnt < 128)                        Shift Left, N = 2 (shift amnt < 64) 

subfic  r31,r6,64                                           subfic  r31,r6,32 
sld     r2,r2,r6                                            slw     r2,r2,r6 
srd     r0,r3,r31                                           srw     r0,r3,r31 
or      r2,r2,r0                                            or      r2,r2,r0 
addic   r31,r6,-64                                          addic   r31,r6,-32 
sld     r0,r3,r31                                           slw     r0,r3,r31 
or      r2,r2,r0                                            or      r2,r2,r0 
sld     r3,r3,r6                                            slw     r3,r3,r6 
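
The 32-bit Shift Left Immediate sequence can be checked against ordinary integer arithmetic with a small Python emulation. This is a sketch only: the helper names (`rotl32`, `mask32`, `rlwinm`, `rlwimi`, `shift_left_imm_3`) are illustrative, the mask helper handles only the non-wrapping masks this sequence needs, and registers are modeled as plain integers.

```python
M32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n (0..31)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & M32


def mask32(mb, me):
    """Ones in bit positions mb..me, where bit 0 is the most significant."""
    assert 0 <= mb <= me <= 31          # wrapped masks are not needed here
    return (M32 >> mb) & (M32 << (31 - me)) & M32


def rlwinm(rs, sh, mb, me):
    """Rotate left, then AND with mask."""
    return rotl32(rs, sh) & mask32(mb, me)


def rlwimi(ra, rs, sh, mb, me):
    """Rotate left, then insert under mask into ra."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & M32)


def shift_left_imm_3(r2, r3, r4, sh):
    """Shift Left Immediate, N = 3, 32-bit mode; r2 holds the high-order word."""
    assert 0 < sh < 32                  # the text assumes a nonzero amount
    r2 = rlwinm(r2, sh, 0, 31 - sh)
    r2 = rlwimi(r2, r3, sh, 32 - sh, 31)
    r3 = rlwinm(r3, sh, 0, 31 - sh)
    r3 = rlwimi(r3, r4, sh, 32 - sh, 31)
    r4 = rlwinm(r4, sh, 0, 31 - sh)
    return r2, r3, r4
```

Each rlwinm clears the low-order `sh` bits of a shifted word, and the following rlwimi fills them from the high-order bits of the next lower word, which is why five instructions (2N-1) suffice.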
Multiple-precision shifts in 64-bit mode, continued         Multiple-precision shifts in 32-bit mode, continued 

Shift Left, N = 3 (shift amnt < 64)                         Shift Left, N = 3 (shift amnt < 32) 

subfic  r31,r6,64                                           subfic  r31,r6,32 
sld     r2,r2,r6                                            slw     r2,r2,r6 
srd     r0,r3,r31                                           srw     r0,r3,r31 
or      r2,r2,r0                                            or      r2,r2,r0 
sld     r3,r3,r6                                            slw     r3,r3,r6 
srd     r0,r4,r31                                           srw     r0,r4,r31 
or      r3,r3,r0                                            or      r3,r3,r0 
sld     r4,r4,r6                                            slw     r4,r4,r6 

Shift Right Immediate, N = 3 (shift amnt < 64)              Shift Right Immediate, N = 3 (shift amnt < 32) 

rldimi  r4,r3,0,64-sh                                       rlwinm  r4,r4,32-sh,sh,31 
rldicl  r4,r4,64-sh,0                                       rlwimi  r4,r3,32-sh,0,sh-1 
rldimi  r3,r2,0,64-sh                                       rlwinm  r3,r3,32-sh,sh,31 
rldicl  r3,r3,64-sh,0                                       rlwimi  r3,r2,32-sh,0,sh-1 
rldicl  r2,r2,64-sh,sh                                      rlwinm  r2,r2,32-sh,sh,31 

Shift Right, N = 2 (shift amnt < 128)                       Shift Right, N = 2 (shift amnt < 64) 

subfic  r31,r6,64                                           subfic  r31,r6,32 
srd     r3,r3,r6                                            srw     r3,r3,r6 
sld     r0,r2,r31                                           slw     r0,r2,r31 
or      r3,r3,r0                                            or      r3,r3,r0 
addic   r31,r6,-64                                          addic   r31,r6,-32 
srd     r0,r2,r31                                           srw     r0,r2,r31 
or      r3,r3,r0                                            or      r3,r3,r0 
srd     r2,r2,r6                                            srw     r2,r2,r6 

Shift Right, N = 3 (shift amnt < 64)                        Shift Right, N = 3 (shift amnt < 32) 

subfic  r31,r6,64                                           subfic  r31,r6,32 
srd     r4,r4,r6                                            srw     r4,r4,r6 
sld     r0,r3,r31                                           slw     r0,r3,r31 
or      r4,r4,r0                                            or      r4,r4,r0 
srd     r3,r3,r6                                            srw     r3,r3,r6 
sld     r0,r2,r31                                           slw     r0,r2,r31 
or      r3,r3,r0                                            or      r3,r3,r0 
srd     r2,r2,r6                                            srw     r2,r2,r6 

Shift Right Algebraic Immediate, N = 3 (shift amnt < 64)    Shift Right Algebraic Immediate, N = 3 (shift amnt < 32) 

rldimi  r4,r3,0,64-sh                                       rlwinm  r4,r4,32-sh,sh,31 
rldicl  r4,r4,64-sh,0                                       rlwimi  r4,r3,32-sh,0,sh-1 
rldimi  r3,r2,0,64-sh                                       rlwinm  r3,r3,32-sh,sh,31 
rldicl  r3,r3,64-sh,0                                       rlwimi  r3,r2,32-sh,0,sh-1 
sradi   r2,r2,sh                                            srawi   r2,r2,sh 

Shift Right Algebraic, N = 2 (shift amnt < 128)             Shift Right Algebraic, N = 2 (shift amnt < 64) 

subfic  r31,r6,64                                           subfic  r31,r6,32 
srd     r3,r3,r6                                            srw     r3,r3,r6 
sld     r0,r2,r31                                           slw     r0,r2,r31 
or      r3,r3,r0                                            or      r3,r3,r0 
addic.  r31,r6,-64                                          addic.  r31,r6,-32 
srad    r0,r2,r31                                           sraw    r0,r2,r31 
ble     $+8                                                 ble     $+8 
ori     r3,r0,0                                             ori     r3,r0,0 
srad    r2,r2,r6                                            sraw    r2,r2,r6 
Multiple-precision shifts in 64-bit mode, continued         Multiple-precision shifts in 32-bit mode, continued 

Shift Right Algebraic, N = 3 (shift amnt < 64)              Shift Right Algebraic, N = 3 (shift amnt < 32) 

subfic  r31,r6,64                                           subfic  r31,r6,32 
srd     r4,r4,r6                                            srw     r4,r4,r6 
sld     r0,r3,r31                                           slw     r0,r3,r31 
or      r4,r4,r0                                            or      r4,r4,r0 
srd     r3,r3,r6                                            srw     r3,r3,r6 
sld     r0,r2,r31                                           slw     r0,r2,r31 
or      r3,r3,r0                                            or      r3,r3,r0 
srad    r2,r2,r6                                            sraw    r2,r2,r6 
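
Why the N = 2 sequences tolerate shift amounts up to 127 in 64-bit mode can be seen by emulating them in Python. The sketch below assumes the usual behavior of sld, srd, and srad (a 7-bit shift amount, with amounts of 64 or more producing zero or a sign fill); the helper names are illustrative, and registers are modeled as unsigned 64-bit integers.

```python
M64 = (1 << 64) - 1


def sld(x, n):
    n &= 0x7F                            # only the low-order 7 bits are used
    return 0 if n >= 64 else (x << n) & M64


def srd(x, n):
    n &= 0x7F
    return 0 if n >= 64 else x >> n


def srad(x, n):
    n &= 0x7F
    signed = x - (1 << 64) if x >> 63 else x
    return (signed >> min(n, 63)) & M64  # 64 or more: result is all sign bits


def shift_left_2(r2, r3, r6):
    """Shift Left, N = 2, 64-bit mode; r2 holds the high-order doubleword."""
    r31 = 64 - r6                        # subfic r31,r6,64
    r2 = sld(r2, r6)
    r2 |= srd(r3, r31)                   # bits carried up when r6 < 64
    r31 = r6 - 64                        # addic  r31,r6,-64
    r2 |= sld(r3, r31)                   # bits carried up when r6 >= 64
    r3 = sld(r3, r6)
    return r2, r3


def shift_right_alg_2(r2, r3, r6):
    """Shift Right Algebraic, N = 2, 64-bit mode."""
    r31 = 64 - r6                        # subfic r31,r6,64
    r3 = srd(r3, r6)
    r3 |= sld(r2, r31)
    r31 = r6 - 64                        # addic. sets CR0 from this result
    r0 = srad(r2, r31)
    if r31 > 0:                          # ble $+8 skips the copy otherwise
        r3 = r0                          # ori r3,r0,0
    r2 = srad(r2, r6)
    return r2, r3
```

Only one of the two terms merged into the result is nonzero for any given amount, because the other has a 7-bit shift amount of 64 or more. The algebraic variant needs the conditional copy instead of an `or` because a sign-filled intermediate must not be merged when the amount is 64 or less.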


The examples shown above for 32-bit mode work both 
in 32-bit mode of a 64-bit implementation and in a 
32-bit implementation. They perform the shift in units 
of words. If ability to run in 32-bit implementations is 
not required, in a 64-bit implementation better per¬ 
formance can be obtained in 32-bit mode than that of 
the examples shown above, by using all 64 bits of 
GPRs 2 and 3 (and 4) to contain the quantity to be 
shifted, and placing the result into all 64 bits of the 
same registers. 


Let N be the number of doublewords to be shifted. 

The examples shown above for 64-bit mode work 
equally well in 32-bit mode of a 64-bit implementation, 
using all 64 bits of the registers. For N>2, the 
number of instructions required is 2N-1 (immediate 
shifts) or 3N-1 (non-immediate shifts), compared with 
4N-1 (immediate shifts) or 6N-1 (non-immediate 
shifts) for the examples shown above for 32-bit mode. 
(The examples shown above for 32-bit mode require 
using twice as many registers to hold the quantity to 
be shifted.) 
E.3 Floating-Point Conversions 


This appendix gives examples of how the Floating- 
Point Conversion instructions can be used to perform 
various conversions. 


Warning: Some of the examples use the fsel 
instruction. Care must be taken in using fsel if IEEE 
compatibility is required, or if the values being tested can be 
NaNs or infinities: see Section E.4.4, “Notes” on 
page 251. 


E.3.1 Conversion from 
Floating-Point Number to 
Floating-Point Integer 

In a 64-bit Implementation 

The full convert to floating-point integer function can 
be implemented with the sequence shown below, 
assuming the floating-point value to be converted is 
in FPR 1, and the result is returned in FPR 3. 

mtfsb0   23        #clear VXCVI 
fctid[z] f3,f1     #convert to fx int 
fcfid    f3,f3     #convert back again 
mcrfs    7,5       #VXCVI to CR 
bf       31,$+8    #skip if VXCVI was 0 
fmr      f3,f1     #input was fp int 

E.3.2 Conversion from Floating-Point 
Number to Signed Fixed-Point Integer 
Doubleword 

This example applies to 64-bit implementations only. 

The full convert to signed fixed-point integer 
doubleword function can be implemented with the 
sequence shown below, assuming the floating-point 
value to be converted is in FPR 1, the result is 
returned in GPR 3, and a doubleword at displacement 
“disp” from the address in GPR 1 can be used as 
scratch space. 

fctid[z] f2,f1         #convert to dword int 
stfd     f2,disp(r1)   #store float 
ld       r3,disp(r1)   #load dword 


E.3.3 Conversion from Floating-Point 
Number to Unsigned Fixed-Point 
Integer Doubleword 

This example applies to 64-bit implementations only. 

The full convert to unsigned fixed-point integer 
doubleword function can be implemented with the 
sequence shown below, assuming the floating-point 
value to be converted is in FPR 1, the value 0 is in 
FPR 0, the value 2**64-2048 is in FPR 3, the value 2**63 is 
in FPR 4 and GPR 4, the result is returned in GPR 3, 
and a doubleword at displacement “disp” from the 
address in GPR 1 can be used as scratch space. 

fsel     f2,f1,f1,f0    #use 0 if < 0 
fsub     f5,f3,f1       #use max if > max 
fsel     f2,f5,f2,f3 
fsub     f5,f2,f4       #subtract 2**63 
fcmpu    cr2,f2,f4      #use diff if ≥ 2**63 
fsel     f2,f5,f5,f2 
fctid[z] f2,f2          #convert to fx int 
stfd     f2,disp(r1)    #store float 
ld       r3,disp(r1)    #load dword 
blt      cr2,$+8        #add 2**63 if input 
add      r3,r3,r4       # was ≥ 2**63 
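
The clamping and bias steps of this sequence can be followed in a Python model. It is a sketch under assumptions: `fsel` is modeled as a plain comparison, `int()` plays the role of the round-toward-zero form fctidz, and the function name `to_unsigned_dword` is illustrative.

```python
M64 = (1 << 64) - 1


def fsel(fra, frc, frb):
    """fsel FRT,FRA,FRC,FRB: FRC if FRA >= 0.0, otherwise (including NaN) FRB."""
    return frc if fra >= 0.0 else frb


def to_unsigned_dword(f1):
    f0 = 0.0
    f3 = float(2**64 - 2048)        # largest double below 2**64
    f4 = float(2**63)
    f2 = fsel(f1, f1, f0)           # use 0 if < 0
    f5 = f3 - f1
    f2 = fsel(f5, f2, f3)           # use max if > max
    f5 = f2 - f4                    # subtract 2**63
    below = f2 < f4                 # fcmpu cr2,f2,f4
    f2 = fsel(f5, f5, f2)           # use diff if >= 2**63
    r3 = int(f2)                    # fctidz: now within signed range
    if not below:                   # blt cr2,$+8 skips the add
        r3 = (r3 + 2**63) & M64     # add 2**63 back in fixed point
    return r3
```

The subtraction of 2**63 brings large inputs into the range a signed conversion can represent, and the fixed-point add restores the bias afterwards. In this model, `to_unsigned_dword(1e30)` saturates at 2**64-2048, the value assumed to be in FPR 3.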

E.3.4 Conversion from Floating-Point 
Number to Signed Fixed-Point Integer 
Word 

The full convert to signed fixed-point integer word 
function can be implemented with the sequence 
shown below, assuming the floating-point value to be 
converted is in FPR 1, the result is returned in GPR 3, 
and a doubleword at displacement “disp” from the 
address in GPR 1 can be used as scratch space. The 
last instruction is needed only if a 64-bit result is 
required, and applies to 64-bit implementations only. 

fctiw[z] f2,f1           #convert to fx int 
stfd     f2,disp(r1)     #store float 
lwz      r3,disp+4(r1)   #load word and zero 
extsw    r3,r3           #(for 64-bit result) 
E.3.5 Conversion from Floating-Point 
Number to Unsigned Fixed-Point 
Integer Word 

In a 64-bit Implementation 

The full convert to unsigned fixed-point integer word 
function can be implemented with the sequence 
shown below, assuming the floating-point value to be 
converted is in FPR 1, the value 0 is in FPR 0, the 
value 2**32-1 is in FPR 3, the result is returned in GPR 
3, and a doubleword at displacement “disp” from the 
address in GPR 1 can be used as scratch space. 

fsel     f2,f1,f1,f0     #use 0 if < 0 
fsub     f4,f3,f1        #use max if > max 
fsel     f2,f4,f2,f3 
fctid[z] f2,f2           #convert to fx int 
stfd     f2,disp(r1)     #store float 
lwz      r3,disp+4(r1)   #load word and zero 

In a 32-bit Implementation 

The full convert to unsigned fixed-point integer word 
function can be implemented with the sequence 
shown below, assuming the floating-point value to be 
converted is in FPR 1, the value 0 is in FPR 0, the 
value 2**32 is in FPR 3, the value 2**31 is in FPR 4 and 
GPR 4, the result is returned in GPR 3, and a 
doubleword at displacement “disp” from the address 
in GPR 1 can be used as scratch space. 

fsel     f2,f1,f1,f0     #use 0 if < 0 
fsub     f5,f3,f1        #use max if > max 
fsel     f2,f5,f2,f3 
fsub     f5,f2,f4        #subtract 2**31 
fcmpu    cr2,f2,f4       #use diff if ≥ 2**31 
fsel     f2,f5,f5,f2 
fctiw[z] f2,f2           #convert to fx int 
stfd     f2,disp(r1)     #store float 
lwz      r3,disp+4(r1)   #load word 
blt      cr2,$+8         #add 2**31 if input 
add      r3,r3,r4        # was ≥ 2**31 

E.3.6 Conversion from Signed 
Fixed-Point Integer Doubleword to 
Floating-Point Number 

This example applies to 64-bit implementations only. 

The full convert from signed fixed-point integer 
doubleword function, using the rounding mode 
specified by FPSCR[RN], can be implemented with the 
sequence shown below, assuming the fixed-point 
value to be converted is in GPR 3, the result is 
returned in FPR 1, and a doubleword at displacement 
“disp” from the address in GPR 1 can be used as 
scratch space. 

std      r3,disp(r1)     #store dword 
lfd      f1,disp(r1)     #load float 
fcfid    f1,f1           #convert to fpu int 

E.3.7 Conversion from Unsigned 
Fixed-Point Integer Doubleword to 
Floating-Point Number 

This example applies to 64-bit implementations only. 

The full convert from unsigned fixed-point integer 
doubleword function, using the rounding mode 
specified by FPSCR[RN], can be implemented with the 
sequence shown below, assuming the fixed-point 
value to be converted is in GPR 3, the value 2**32 is in 
FPR 4, the result is returned in FPR 1, and two 
doublewords at displacement “disp” from the address 
in GPR 1 can be used as scratch space. 

rldicl   r2,r3,32,32      #isolate high half 
rldicl   r0,r3,0,32       #isolate low half 
std      r2,disp(r1)      #store dword both 
std      r0,disp+8(r1) 
lfd      f2,disp(r1)      #load float both 
lfd      f1,disp+8(r1) 
fcfid    f2,f2            #convert each half to 
fcfid    f1,f1            # fpu int (no rnd) 
fmadd    f1,f4,f2,f1      #(2**32)*high + low 
                          # (only add can rnd) 


An alternative, shorter, sequence can be used if 
rounding according to FPSCR[RN] is desired and 
FPSCR[RN] specifies Round toward +Infinity or Round 
toward -Infinity, or if it is acceptable for a rounded 
answer to be either of the two representable 
floating-point integers nearest algebraically to the given 
fixed-point integer. In this case the full convert from 
unsigned fixed-point integer doubleword function can 
be implemented with the sequence shown below, 
assuming the value 2**64 is in FPR 2. 

std      r3,disp(r1)      #store dword 
lfd      f1,disp(r1)      #load float 
fcfid    f1,f1            #convert to fpu int 
fadd     f4,f1,f2         #add 2**64 
fsel     f1,f1,f1,f4      # if r3 < 0 
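
The first sequence in this section (convert each 32-bit half, then combine with one multiply-add) can be modeled in Python to see why only the final add can round. This is a sketch with assumptions: the function name is illustrative, and Python floats always round to nearest, so other FPSCR[RN] modes are not modeled.

```python
def unsigned_dword_to_float(r3):
    """Model of: rldicl/rldicl, std/std, lfd/lfd, fcfid/fcfid, fmadd."""
    high = float(r3 >> 32)            # fcfid of the high half: exact
    low = float(r3 & 0xFFFFFFFF)      # fcfid of the low half: exact
    # 2**32 * high is exactly representable, so this ordinary multiply
    # and add rounds only once, as the fused fmadd does.
    return 4294967296.0 * high + low
```

Each half is below 2**32 and therefore converts without rounding. Under round-to-nearest the result agrees with a direct, correctly rounded conversion, for example `unsigned_dword_to_float(2**64 - 1) == float(2**64 - 1)`.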
E.3.8 Conversion from Signed 
Fixed-Point Integer Word to 
Floating-Point Number 

In a 64-bit Implementation 

The full convert from signed fixed-point integer word 
function can be implemented with the sequence 
shown below, assuming the fixed-point value to be 
converted is in GPR 3, the result is returned in FPR 1, 
and a doubleword at displacement “disp” from the 
address in GPR 1 can be used as scratch space. 
(Rounding cannot occur.) 


extsw    r3,r3           #extend sign 
std      r3,disp(r1)     #store dword 
lfd      f1,disp(r1)     #load float 
fcfid    f1,f1           #convert to fpu int 


E.3.9 Conversion from Unsigned 
Fixed-Point Integer Word to 
Floating-Point Number 

In a 64-bit Implementation 


The full convert from unsigned fixed-point integer 
word function can be implemented with the sequence 
shown below, assuming the fixed-point value to be 
converted is in GPR 3, the result is returned in FPR 1, 
and a doubleword at displacement “disp” from the 
address in GPR 1 can be used as scratch space. 
(Rounding cannot occur.) 


rldicl   r0,r3,0,32      #zero-extend 
std      r0,disp(r1)     #store dword 
lfd      f1,disp(r1)     #load float 
fcfid    f1,f1           #convert to fpu int 
E.4 Floating-Point Selection 


This appendix gives examples of how the Floating 
Select instruction can be used to implement floating¬ 
point minimum and maximum functions, and certain 
simple forms of if-then-else constructions, without 
branching. 

The examples show program fragments in an 
imaginary, C-like, high-level programming language, and 
the corresponding program fragment using fsel and 
other PowerPC instructions. In the examples, a, b, x, 
y, and z are floating-point variables, which are 
assumed to be in FPRs fa, fb, fx, fy, and fz. FPR fs is 
assumed to be available for scratch space. 

Additional examples can be found in Section E.3, 
“Floating-Point Conversions” on page 248. 

Warning: Care must be taken in using fsel if IEEE 
compatibility is required, or if the values being tested 
can be NaNs or infinities: see Section E.4.4, “Notes.” 


E.4.1 Comparison to Zero 


High-level language:         PowerPC:                Notes 

if a ≥ 0.0 then x ← y        fsel  fx,fa,fy,fz       (1) 
else x ← z 

if a > 0.0 then x ← y        fneg  fs,fa             (1,2) 
else x ← z                   fsel  fx,fs,fz,fy 

if a = 0.0 then x ← y        fsel  fx,fa,fy,fz       (1) 
else x ← z                   fneg  fs,fa 
                             fsel  fx,fs,fx,fz 


E.4.2 Minimum and Maximum 


High-level language:         PowerPC:                Notes 

x ← min(a,b)                 fsub  fs,fa,fb          (3,4,5) 
                             fsel  fx,fs,fb,fa 

x ← max(a,b)                 fsub  fs,fa,fb          (3,4,5) 
                             fsel  fx,fs,fa,fb 

E.4.3 Simple if-then-else 
Constructions 


High-level language:         PowerPC:                Notes 

if a ≥ b then x ← y          fsub  fs,fa,fb          (4,5) 
else x ← z                   fsel  fx,fs,fy,fz 

if a > b then x ← y          fsub  fs,fb,fa          (3,4,5) 
else x ← z                   fsel  fx,fs,fz,fy 

if a = b then x ← y          fsub  fs,fa,fb          (4,5) 
else x ← z                   fsel  fx,fs,fy,fz 
                             fneg  fs,fs 
                             fsel  fx,fs,fx,fz 


E.4.4 Notes 

The following Notes apply to the preceding examples, 
and to the corresponding cases using the other three 
arithmetic relations (<, ≤, and ≠). They should also 
be considered when any other use of fsel is 
contemplated. 

In these Notes, the “optimized program” is the 
PowerPC program shown, and the “unoptimized 
program” (not shown) is the corresponding PowerPC 
program that uses fcmpu and Branch Conditional 
instructions instead of fsel. 

1. The unoptimized program affects the VXSNAN bit 
of the FPSCR, and therefore may cause the 
system error handler to be invoked if the corre¬ 
sponding exception is enabled, while the opti¬ 
mized program does not affect this bit. This is 
incompatible with the IEEE standard. 

2. The optimized program gives the incorrect result 
if a is a NaN. 

3. The optimized program gives the incorrect result 
if a and/or b is a NaN (except that it may give the 
correct result in some cases for the minimum and 
maximum functions, depending on how those 
functions are defined to operate on NaNs). 

4. The optimized program gives the incorrect result 
if a and b are infinities of the same sign. (Here it 
is assumed that Invalid Operation Exceptions are 
disabled, in which case the result of the sub¬ 
traction is a NaN. The analysis is more compli¬ 
cated if Invalid Operation Exceptions are enabled, 
because in that case the target register of the 
subtraction is unchanged.) 

5. The optimized program affects the OX, UX, XX, 
and VXISI bits of the FPSCR, and therefore may 
cause the system error handler to be invoked if 
the corresponding exceptions are enabled, while 
the unoptimized program does not affect these 
bits. This is incompatible with the IEEE standard. 
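
The selection idiom and the NaN and infinity caveats in Notes 3 and 4 can be reproduced with a short Python model. It is a sketch only: `fsel` is modeled as a comparison in which a NaN selects the second alternative, the helper names are illustrative, and exceptions are assumed disabled (Python float subtraction of infinities quietly yields a NaN).

```python
import math


def fsel(fra, frc, frb):
    """fsel FRT,FRA,FRC,FRB: FRC if FRA >= 0.0, otherwise (including NaN) FRB."""
    return frc if fra >= 0.0 else frb


def fmin(a, b):
    return fsel(a - b, b, a)       # fsub fs,fa,fb ; fsel fx,fs,fb,fa


def fmax(a, b):
    return fsel(a - b, a, b)       # fsub fs,fa,fb ; fsel fx,fs,fa,fb


def select_ge(a, b, y, z):
    """Branch-free form of: if a >= b then y else z."""
    return fsel(a - b, y, z)       # fsub fs,fa,fb ; fsel fx,fs,fy,fz


print(select_ge(2.0, 2.0, "y", "z"))             # ordinary operands
print(select_ge(math.inf, math.inf, "y", "z"))   # Note 4: inf - inf is a NaN
print(select_ge(math.nan, 0.0, "y", "z"))        # Note 3: a is a NaN
```

For equal infinities the comparison a ≥ b is true, yet the model selects z because the difference is a NaN. This is the incorrect result that Note 4 warns about.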
Appendix F. Cross-Reference for Changed Power Mnemonics 


The following table lists the Power instruction mne¬ 
monics that have been changed in the PowerPC Archi¬ 
tecture, sorted by Power mnemonic. 

To determine the PowerPC mnemonic for one of these 
Power mnemonics, find the Power mnemonic in the 
second column of the table: the remainder of the line 
gives the PowerPC mnemonic and the page or Book in 
which the instruction is described, as well as the 
instruction names. A page number is shown for 
instructions that are defined in this Book (Part 1, 
“PowerPC User Instruction Set Architecture” on 
page 1), and the Book number is shown for 
instructions that are defined in other Books (Part 2, 
“PowerPC Virtual Environment Architecture” on 
page 117, and Part 3, “PowerPC Operating 
Environment Architecture” on page 141). If an instruction is 
defined in more than one Book, the lowest-numbered 
Book is used. 

Power mnemonics that have not changed are not 
listed. Power instruction names that are the same in 
PowerPC are not repeated: i.e., for these, the last 
column of the table is blank. 
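
The lookup procedure described above amounts to a dictionary access. The sketch below assumes a hypothetical helper `to_powerpc` and copies only a handful of base mnemonics from the table, without the optional [o] and [.] suffixes. It is not a complete cross-reference.

```python
# A few rows copied from the table in this appendix; base mnemonics only.
POWER_TO_POWERPC = {
    "a": "addc",
    "ai": "addic",
    "cal": "addi",
    "cau": "addis",
    "cax": "add",
    "cntlz": "cntlzw",
    "dcs": "sync",
    "l": "lwz",
    "lm": "lmw",
    "sf": "subfc",
    "sl": "slw",
}


def to_powerpc(power_mnemonic):
    """Return the PowerPC mnemonic; unlisted mnemonics are unchanged."""
    return POWER_TO_POWERPC.get(power_mnemonic, power_mnemonic)


print(to_powerpc("cal"))
print(to_powerpc("b"))
```

Returning the input for unlisted mnemonics mirrors the rule that Power mnemonics that have not changed are not listed.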


Page /  Power                                                    PowerPC 
Bk      Mnemonic    Instruction                                  Mnemonic      Instruction 

52      a[o][.]     Add                                          addc[o][.]    Add Carrying 
53      ae[o][.]    Add Extended                                 adde[o][.] 
51      ai          Add Immediate                                addic         Add Immediate Carrying 
51      ai.         Add Immediate and Record                     addic.        Add Immediate Carrying and Record 
53      ame[o][.]   Add To Minus One Extended                    addme[o][.] 
63      andil.      AND Immediate Lower                          andi.         AND Immediate 
63      andiu.      AND Immediate Upper                          andis.        AND Immediate Shifted 
54      aze[o][.]   Add To Zero Extended                         addze[o][.] 
22      bcc[l]      Branch Conditional to Count Register         bcctr[l] 
22      bcr[l]      Branch Conditional to Link Register          bclr[l] 
50      cal         Compute Address Lower                        addi          Add Immediate 
50      cau         Compute Address Upper                        addis         Add Immediate Shifted 
51      cax[o][.]   Compute Address                              add[o][.]     Add 
68      cntlz[.]    Count Leading Zeros                          cntlzw[.]     Count Leading Zeros Word 
Bk II   dclz        Data Cache Line Set to Zero                  dcbz          Data Cache Block set to Zero 
48      dcs         Data Cache Synchronize                       sync          Synchronize 
67      exts[.]     Extend Sign                                  extsh[.]      Extend Sign Halfword 
106     fa[.]       Floating Add                                 fadd[.] 
107     fd[.]       Floating Divide                              fdiv[.] 
107     fm[.]       Floating Multiply                            fmul[.] 
108     fma[.]      Floating Multiply-Add                        fmadd[.] 
108     fms[.]      Floating Multiply-Subtract                   fmsub[.] 
109     fnma[.]     Floating Negative Multiply-Add               fnmadd[.] 
109     fnms[.]     Floating Negative Multiply-Subtract          fnmsub[.] 
106     fs[.]       Floating Subtract                            fsub[.] 
Bk II   ics         Instruction Cache Synchronize                isync         Instruction Synchronize 
33      l           Load                                         lwz           Load Word and Zero 
40      lbrx        Load Byte-Reverse Indexed                    lwbrx         Load Word Byte-Reverse Indexed 
42      lm          Load Multiple                                lmw           Load Multiple Word 
44      lsi         Load String Immediate                        lswi          Load String Word Immediate 
44      lsx         Load String Indexed                          lswx          Load String Word Indexed 
Page /  Power                                                    PowerPC 
Bk      Mnemonic    Instruction                                  Mnemonic      Instruction 

33      lu          Load with Update                                           Load Word and Zero with Update 
33      lux         Load with Update Indexed                                   Load Word and Zero with Update Indexed 
33      lx          Load Indexed                                 lwzx          Load Word and Zero Indexed 
Bk III  mtsri       Move To Segment Register Indirect            mtsrin 
55      muli        Multiply Immediate                           mulli         Multiply Low Immediate 
55      muls[o][.]  Multiply Short                               mullw[o][.]   Multiply Low Word 
64      oril        OR Immediate Lower                                         OR Immediate 
64      oriu        OR Immediate Upper                                         OR Immediate Shifted 
74      rlimi[.]    Rotate Left Immediate Then Mask Insert       rlwimi[.]     Rotate Left Word Immediate then Mask Insert 
71      rlinm[.]    Rotate Left Immediate Then AND With Mask     rlwinm[.]     Rotate Left Word Immediate then AND with Mask 
73      rlnm[.]     Rotate Left Then AND With Mask               rlwnm[.]      Rotate Left Word then AND with Mask 
52      sf[o][.]    Subtract From                                subfc[o][.]   Subtract From Carrying 
53      sfe[o][.]   Subtract From Extended                       subfe[o][.] 
52      sfi         Subtract From Immediate                      subfic        Subtract From Immediate Carrying 
53      sfme[o][.]  Subtract From Minus One Extended             subfme[o][.] 
54      sfze[o][.]  Subtract From Zero Extended                  subfze[o][.] 
75      sl[.]       Shift Left                                   slw[.]        Shift Left Word 
76      sr[.]       Shift Right                                  srw[.]        Shift Right Word 
78      sra[.]      Shift Right Algebraic                        sraw[.]       Shift Right Algebraic Word 
77      srai[.]     Shift Right Algebraic Immediate              srawi[.]      Shift Right Algebraic Word Immediate 
                    Store                                        stw           Store Word 
                    Store Byte-Reverse Indexed                   stwbrx        Store Word Byte-Reverse Indexed 
                    Store Multiple                               stmw          Store Multiple Word 
                    Store String Immediate                       stswi         Store String Word Immediate 
                    Store String Indexed                         stswx         Store String Word Indexed 
                    Store with Update                            stwu          Store Word with Update 
                    Store with Update Indexed                    stwux         Store Word with Update Indexed 
                    Store Indexed                                stwx          Store Word Indexed 
                    Supervisor Call                                            System Call 
                    Trap                                                       Trap Word 
                    Trap Immediate                                             Trap Word Immediate 
                    TLB invalidate Entry                                       TLB Entry Invalidate 
                    XOR Immediate Lower                                        XOR Immediate 
                    XOR Immediate Upper                                        XOR Immediate Shifted 
Appendix G. Incompatibilities with the Power Architecture 


This section identifies the known incompatibilities that
must be managed in the migration from the Power
Architecture to the PowerPC Architecture. Some of
the incompatibilities can, at least in principle, be
detected by the processor, which could trap and let
software simulate the Power operation. Others
cannot be detected by the processor even in principle.


In general, the incompatibilities identified here are
those that affect a Power application program:
incompatibilities for instructions that can be used only
by Power system programs are not necessarily discussed.


G.1 New Instructions, Formerly 
Privileged Instructions 

Instructions new to PowerPC typically use opcode
values (including extended opcode) that are illegal in
Power. A few instructions that are privileged in
Power (e.g., dclz, called dcbz in PowerPC) have been
made non-privileged in PowerPC. Any Power program
that executes one of these now-valid or now-non-privileged
instructions, expecting to cause the system
illegal instruction error handler or the system privileged
instruction error handler to be invoked, will not
execute correctly on PowerPC.

G.2 Newly Privileged 
Instructions 

The following instructions are non-privileged in Power 
but privileged in PowerPC. 

mfmsr 

mfsr 


G.3 Reserved Bits in 
Instructions 

These are shown with '/'s in the instruction layouts. 
In Power such bits are ignored by the processor. In 
PowerPC they must be 0 or the instruction form is 
invalid. 

In several cases the PowerPC Architecture assumes 
that such bits in Power instructions are indeed 0. The 
cases include the following. 

■ cmpi, cmp, cmpli, and cmpl assume that bit 10 in 
the Power instructions is 0. 

■ mtspr and mfspr assume that bits 16:20 in the 
Power instructions are 0. 
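
The reserved-bit assumptions above can be checked mechanically on a Power instruction word. The following is a minimal sketch in Python, not part of the architecture: the helper names are hypothetical, bits are numbered with bit 0 as the most significant bit of the 32-bit instruction, and the primary and extended opcode values used to recognize the instructions are assumptions that should be verified against the opcode maps.

```python
def bits(insn: int, first: int, last: int) -> int:
    """Return instruction bits first:last, where bit 0 is the most significant of 32."""
    width = last - first + 1
    return (insn >> (31 - last)) & ((1 << width) - 1)


def power_reserved_bits_clear(insn: int) -> bool:
    """True if the bits that PowerPC assumes to be 0 are in fact 0.

    Only the cases named in this section are checked; every other
    instruction is reported as acceptable.
    """
    opcd = bits(insn, 0, 5)
    xo = bits(insn, 21, 30)
    # Assumed encodings: cmpli = 10, cmpi = 11 (D-form).
    if opcd in (10, 11):
        return bits(insn, 10, 10) == 0
    # Assumed encodings: cmp = 31/0, cmpl = 31/32 (X-form).
    if opcd == 31 and xo in (0, 32):
        return bits(insn, 10, 10) == 0
    # Assumed encodings: mfspr = 31/339, mtspr = 31/467.
    if opcd == 31 and xo in (339, 467):
        return bits(insn, 16, 20) == 0
    return True
```

A migration tool could run such a check over Power object code and report the instruction words for which it returns False.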


G.4 Reserved Bits in Registers 

Power defines these bits to be 0 on read, and either 0 
or 1 on write. In PowerPC it is implementation 
dependent, for each bit, whether the bit is: 

■ 0 on read and ignored on write; or 

■ copied from source to target on both read and 
write. 
G.5 Alignment Check


The Power MSR AL bit (bit 24) is no longer supported:
the bit is reserved in PowerPC. The low-order bits of
the EA are always used. (Notice that the value 0,
the normal value for a reserved SPR bit, means
“ignore the low-order EA bits” in Power, and the
value 1 means “use the low-order EA bits.”) However,
MSR bit 24 will not be assigned new meaning in the
near future (see Part 3, “PowerPC Operating Environment
Architecture” on page 141), and software is permitted
to write the value 1 to the bit.


G.6 Condition Register

The following instructions specify a field in the CR
explicitly (via the BF field) and also, in Power, use bit
31 as the Record bit. In PowerPC, if bit 31 = 1 for
these instructions the instruction form is invalid. In
Power, if Rc=1 the instructions execute normally
except as follows.

cmp      CR0 is undefined if Rc=1 and BF≠0
cmpl     CR0 is undefined if Rc=1 and BF≠0
mcrxr    CR0 is undefined if Rc=1 and BF≠0
fcmpu    CR1 is undefined if Rc=1
fcmpo    CR1 is undefined if Rc=1
mcrfs    CR1 is undefined if Rc=1 and BF≠1

G.7 Inappropriate use of LK and
Rc bits

For the instructions listed below, if bit 31 (LK or Rc bit
in Power) is set to 1, Power executes the instruction
normally with the exception of setting the Link Register
(if LK=1) or Condition Register Field 0 or 1 (if
Rc=1) to an undefined value. In PowerPC such
instruction forms are invalid.

PowerPC instruction form invalid if bit 31=1 (LK bit
in Power):

sc (svc in Power)
the Condition Register Logical instructions
mcrf
isync (ics in Power)

PowerPC instruction form invalid if bit 31=1 (Rc bit
in Power):

fixed-point X-form Load and Store instructions
fixed-point X-form Compare instructions
the X-form Trap instruction
mtspr, mfspr, mtcrf, mcrxr, mfcr
floating-point X-form Load and Store instructions
floating-point Compare instructions
mcrfs
dcbz (dclz in Power)


G.8 BO Field 

Power shows certain bits in the BO field (used by
Branch Conditional instructions) as “x.” Although
the Power Architecture does not say how these bits
are to be interpreted, they are in fact ignored by the
processor. PowerPC treats these bits differently, as
follows.

BO_0:3  PowerPC shows the bit as “z.” (For the
        “branch always” encoding of the BO field, BO_4
        is also shown as “z.”) If a “z” bit is not zero
        the instruction form is invalid.

BO_4    This bit, which is shown as “x” in Power
        independent of the other four bits, is shown
        in PowerPC as “y” (except for the “branch
        always” encoding of the BO field). The “y” bit
        gives a hint about whether the branch is likely
        to be taken. If a Power program has the
        “wrong” value for this bit, the program will run
        correctly but performance may suffer.

G.9 Branch Conditional to Count 
Register 

For the case in which the Count Register is decremented
and tested (i.e., the case in which BO_2 = 0),
Power specifies only that the branch target address is
undefined, with the implication that the Count Register,
and the Link Register if LK=1, are updated in
the normal way. PowerPC considers this instruction
form invalid.


G.10 System Call 

There are several respects in which PowerPC is
incompatible with Power for System Call instructions,
which in Power are called Supervisor Call
instructions.

■ Power provides a version of the Supervisor Call
instruction (bit 30 = 0) that allows instruction
fetching to continue at any one of 128 locations.
It is used for “fast SVCs.” PowerPC provides no
such version: if bit 30 of the instruction is 0 the
instruction is reserved.

■ Power provides a version of the Supervisor Call
instruction (bits 30:31 = 0b11) that resumes
instruction fetching at one location and sets the
Link Register to the address of the next instruction.
PowerPC provides no such version: if bit 31
of the instruction is 1 the instruction form is
invalid.

■ For Power, information from the MSR is saved in
the Count Register. For PowerPC this information
is saved in SRR1.

■ Power permits bits 16:29 of the instruction to be
non-zero, while in PowerPC such an instruction
form is invalid.

■ Power saves the low-order 16 bits of the instruction
in the Count Register. PowerPC does not
save them.

■ The settings of MSR bits by the associated interrupt
differ between Power and PowerPC: see
POWER Processor Architecture and Part 3,
“PowerPC Operating Environment Architecture”
on page 141.
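
The encoding rules in the list above can be summarized as a small classifier. This is an illustrative Python sketch only: the function names are hypothetical, bit 0 is taken to be the most significant bit of the instruction, and only the three encoding rules stated in this section are applied (any other reserved fields of the instruction are not examined).

```python
def bits(insn: int, first: int, last: int) -> int:
    """Return instruction bits first:last, where bit 0 is the most significant of 32."""
    width = last - first + 1
    return (insn >> (31 - last)) & ((1 << width) - 1)


def classify_sc(insn: int) -> str:
    """Classify a primary-opcode-17 word by the PowerPC rules of this section."""
    if bits(insn, 0, 5) != 17:
        raise ValueError("not a System Call / Supervisor Call encoding")
    if bits(insn, 30, 30) == 0:
        return "reserved"        # Power "fast SVC" version
    if bits(insn, 31, 31) == 1:
        return "invalid form"    # Power version that sets the Link Register
    if bits(insn, 16, 29) != 0:
        return "invalid form"    # Power permits these bits to be non-zero
    return "valid"
```

For example, classify_sc(0x44000002) yields "valid", while the same word with bit 30 cleared is reported as "reserved".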

G.11 Fixed-Point Exception 
Register (XER) 

Bits 16:23 of the XER are reserved in PowerPC, while
in Power they are defined and contain the comparison
byte for the lscbx instruction (which PowerPC lacks).

- Engineering Note -

For reasons of compatibility with the Power Architecture,
early implementations must handle XER
bits 16:23 according to the second of the two permitted
treatments of reserved bits in status and
control registers. That is, early implementations
must set the bits from the source value on write,
and return the value last set for them on read.


G.12 Update Forms of Storage 
Access 

PowerPC requires that RA not be equal to either RT
(fixed-point Load only) or 0. If the restriction is violated
the instruction form is invalid. Power permits
these cases, and simply avoids saving the EA.


G.13 Multiple Register Loads 

PowerPC requires that RA, and RB if present in the
instruction format, not be in the range of registers to
be loaded, while Power permits this and does not
alter RA or RB in this case. (The PowerPC restriction
applies even if RA = 0, although there is no obvious
benefit to the restriction in this case since RA is not
used to compute the effective address if RA = 0.) If
the PowerPC restriction is violated, the instruction
form is invalid. The instructions affected are:

lmw (lm in Power)

lswi (lsi in Power)

lswx (lsx in Power)

Thus, for example, an lmw instruction that loads all 32
registers is valid in Power but is an invalid form in
PowerPC.
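
The restriction can be expressed as a set-membership test. The sketch below is illustrative Python with hypothetical helper names. It assumes that lmw loads registers RT through 31, and that the string loads fill successive registers four bytes at a time starting at RT and wrapping from register 31 to register 0; both assumptions come from the instruction descriptions rather than from this appendix.

```python
from typing import Optional, Set


def lmw_registers(rt: int) -> Set[int]:
    """Registers loaded by lmw, assuming it loads RT through 31."""
    return set(range(rt, 32))


def lsw_registers(rt: int, nbytes: int) -> Set[int]:
    """Registers loaded by a string load of nbytes bytes, assuming wrap-around."""
    count = (nbytes + 3) // 4
    return {(rt + i) % 32 for i in range(count)}


def load_form_is_valid(loaded: Set[int], ra: int, rb: Optional[int] = None) -> bool:
    """PowerPC rule: neither RA nor RB (if present) may be in the loaded range."""
    if ra in loaded:
        return False
    if rb is not None and rb in loaded:
        return False
    return True
```

With these definitions, load_form_is_valid(lmw_registers(0), ra) is False for every ra, which corresponds to the case of an lmw that loads all 32 registers.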

G.14 Alignment for Load/Store 
Multiple 

PowerPC requires the EA to be word-aligned, and
yields an Alignment interrupt or boundedly undefined
results if it is not. Power specifies that an Alignment
interrupt occurs (if AL=1).

- Engineering Note -

If an attempt is made to execute an lmw or stmw
instruction having an incorrectly aligned effective
address, early implementations must either correctly
transfer the addressed bytes or cause an
Alignment interrupt, for reasons of compatibility
with the Power Architecture.


G.15 Move Assist instructions 

There are several respects in which PowerPC is 
incompatible with Power for Move Assist instructions. 

■ In PowerPC an lswx instruction with zero length
leaves the content of RT undefined, while in
Power the corresponding instruction (lsx) does
not alter RT in this case.

■ In PowerPC an lswx instruction with zero length
may alter the Reference bit, and an stswx
instruction with zero length may alter the Reference
and Change bits, while in Power the corresponding
instructions (lsx and stsx) do not alter
the Reference and Change bits in this case.


G.16 Synchronization 

The sync instruction (called dcs in Power) and the
isync instruction (called ics in Power) cause much
more pervasive synchronization in PowerPC than in
Power.
G.17 Move To/From SPR


There are several respects in which PowerPC is
incompatible with Power for Move To/From Special
Purpose Register instructions.

■ The SPR field is ten bits long in PowerPC, but only
five in Power (see also Section G.3, “Reserved
Bits in Instructions” on page 255).

■ mfspr can be used to read the Decrementer in
problem state in Power, but only in privileged
state in PowerPC.

■ If the SPR value specified in the instruction is not
one of the defined values, PowerPC considers the
instruction form invalid. (In problem state, the
allowed SPR values exclude those accessible only
in privileged state.) Power does not alter any
architected registers in this case, and generates
a Privileged Instruction type Program interrupt if
the instruction is executed in problem state and
SPR_0 = 1.


G.18 Effects of Exceptions on
FPSCR Bits FR and FI

For the following cases, Power does not say how FR
and FI are set, while PowerPC preserves them for
Invalid Operation Exception caused by a Compare
instruction, sets FI to 1 and FR to an undefined value
for disabled Overflow Exception, and clears them otherwise.

■ Invalid Operation Exception (enabled or disabled)

■ Zero Divide Exception (enabled or disabled)

■ Disabled Overflow Exception

G.19 Floating-Point Store
Instructions

Power uses FPSCR_UE to help determine whether
denormalization should be done, while PowerPC does
not. Using FPSCR_UE is in fact incorrect: if
FPSCR_UE = 1 and a denormalized single-precision
number is copied from one storage location to
another by means of lfs followed by stfs, the two
“copies” may not be the same.


G.20 Move From FPSCR

Power defines the high-order 32 bits of the result of
mffs to be 0xFFFF_FFFF, while PowerPC says they are
undefined.

G.21 Zeroing Bytes in the Data 
Cache 

The dclz instruction of Power and the dcbz instruction
of PowerPC have the same opcode. However, the
functions differ in the following respects.

■ dclz clears a line while dcbz clears a block.

■ dclz saves the EA in RA (if RA≠0) while dcbz
does not.

■ dclz is privileged while dcbz is not.

G.22 Floating-Point Load/Store 
to Direct-Store Segment 

In Power a floating-point Load or Store instruction to a
direct-store segment causes a Data Storage
interrupt, while in PowerPC the instruction either executes
correctly or causes an Alignment interrupt.

G.23 Segment Register 
Instructions 

The definitions of the four Segment Register
instructions (mtsr, mtsrin, mfsr, and mfsrin) differ in
two respects between Power and PowerPC.
Instructions similar to mtsrin and mfsrin are called
mtsri and mfsri in Power.

privilege: mfsr and mfsri are problem state
           instructions in Power, while mfsr and
           mfsrin are privileged in PowerPC.

function:  the “indirect” instructions (mtsri and
           mfsri) in Power use an RA register in
           computing the Segment Register number,
           and the computed EA is stored into RA (if
           RA≠0 and RA≠RT), while in PowerPC
           mtsrin and mfsrin have no RA field and
           EA is not stored.

mtsr, mtsrin (mtsri), and mfsr have the same opcodes
in PowerPC as in Power. mfsri (Power) and mfsrin
(PowerPC) have different opcodes.






G.24 TLB Entry Invalidation 

The tlbi instruction of Power and the tlbie instruction 
of PowerPC have the same opcode. However, the 
functions differ in the following respects. 

■ tlbi computes the EA as (RA|0) + (RB), while
tlbie lacks an RA field and computes the EA as
(RB).

■ tlbi saves the EA in RA (if RA≠0), while tlbie
lacks an RA field and does not save the EA.


G.25 Floating-Point Interrupts 

Both architectures use MSR bit 20 to control the generation
of interrupts for floating-point enabled exceptions.
However, in PowerPC this bit is part of a
two-bit value which controls the occurrence, precision,
and recoverability of the interrupt, while in Power this
bit is used independently to control the occurrence of
the interrupt (in Power all floating-point interrupts are
precise).


G.26 Timing Facilities 

G.26.1 Real-Time Clock 

The Power Real-Time Clock is not supported in 
PowerPC. Instead, PowerPC provides a Time Base. 
Both the RTC and the TB are 64-bit Special Purpose 
Registers, but they differ in the following respects. 

■ The RTC counts seconds and nanoseconds, while
the TB counts “ticks.” The ticking rate of the TB
is implementation-dependent.

■ The RTC increments discontinuously: 1 is added
to RTCU when the value in RTCL passes
999_999_999. The TB increments continuously: 1
is added to TBU when the value in TBL passes
0xFFFF_FFFF.

■ The RTC is written and read by the mtspr and
mfspr instructions, using SPR numbers that
denote the RTCU and RTCL. The TB is written by
the mtspr instruction (using new SPR numbers),
and read by the new mftb instruction.

■ The SPR numbers that denote Power's RTCL and
RTCU are invalid in PowerPC.

■ The RTC is guaranteed to increment at least once
in the time required to execute ten Add Immediate
instructions. No analogous guarantee is
made for the TB.

■ Not all bits of RTCL need be implemented, while
all bits of the TB must be implemented.
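
The difference between the two carry rules can be illustrated with a small model. This Python sketch is not an architected definition: the function names are hypothetical, both halves of each register are treated as 32-bit values, and the RTC is assumed to advance by exactly one nanosecond per step (an implementation that omits low-order RTCL bits would advance in larger steps).

```python
NS_PER_SECOND = 1_000_000_000
MASK32 = 0xFFFF_FFFF


def rtc_step(rtcu: int, rtcl: int):
    """Advance the Power RTC by one nanosecond; RTCL wraps after 999_999_999."""
    rtcl += 1
    if rtcl == NS_PER_SECOND:
        rtcl = 0
        rtcu = (rtcu + 1) & MASK32
    return rtcu, rtcl


def tb_step(tbu: int, tbl: int):
    """Advance the PowerPC Time Base by one tick; TBL wraps after 0xFFFF_FFFF."""
    tbl = (tbl + 1) & MASK32
    if tbl == 0:
        tbu = (tbu + 1) & MASK32
    return tbu, tbl


def rtc_to_ns(rtcu: int, rtcl: int) -> int:
    """Elapsed nanoseconds: the two halves are seconds and nanoseconds."""
    return rtcu * NS_PER_SECOND + rtcl


def tb_to_ticks(tbu: int, tbl: int) -> int:
    """Elapsed ticks: the two halves form one 64-bit binary counter."""
    return (tbu << 32) | tbl
```

Software that concatenated RTCU and RTCL as if they were a 64-bit binary count would therefore compute a wrong interval, whereas that concatenation is exactly right for the TB.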

G.26.2 Decrementer 

The PowerPC Decrementer differs from the Power
Decrementer in the following respects.

■ The PowerPC DEC decrements at the same rate
that the TB increments, while the Power
Decrementer decrements every nanosecond
(which is the same rate that the RTC increments).

■ Not all bits of the Power DEC need be implemented,
while all bits of the PowerPC DEC must
be implemented.

■ The interrupt caused by the DEC has its own
interrupt vector location in PowerPC, but is considered
an External interrupt in Power.
G.27 Deleted Instructions


The following instructions are part of the Power Architecture
but have been dropped from the PowerPC
Architecture.

abs      Absolute
clcs     Cache Line Compute Size
clf      Cache Line Flush
cli      Cache Line Invalidate
dclst    Data Cache Line Store
div      Divide
divs     Divide Short
doz      Difference Or Zero
dozi     Difference Or Zero Immediate
lscbx    Load String And Compare Byte Indexed
maskg    Mask Generate
maskir   Mask Insert From Register
mfsri    Move From Segment Register Indirect
mul      Multiply
nabs     Negative Absolute
rac      Real Address Compute
rlmi     Rotate Left Then Mask Insert
rrib     Rotate Right And Insert Bit
sle      Shift Left Extended
sleq     Shift Left Extended With MQ
sliq     Shift Left Immediate With MQ
slliq    Shift Left Long Immediate With MQ
sllq     Shift Left Long With MQ
slq      Shift Left With MQ
sraiq    Shift Right Algebraic Immediate With MQ
sraq     Shift Right Algebraic With MQ
sre      Shift Right Extended
srea     Shift Right Extended Algebraic
sreq     Shift Right Extended With MQ
sriq     Shift Right Immediate With MQ
srliq    Shift Right Long Immediate With MQ
srlq     Shift Right Long With MQ
srq      Shift Right With MQ
svc[l]   Supervisor Call, with SA = 0

Note: Many of these instructions use the MQ register.
The MQ is not defined in the PowerPC Architecture.


G.28 Discontinued Opcodes


The opcodes listed below are defined in the Power
Architecture but have been dropped from the
PowerPC Architecture. The list contains the old mnemonic
(MNEM), the primary opcode (PRI), and the
extended opcode (XOP) if appropriate.


MNEM     PRI   XOP

abs      31    360
clcs     31    531
clf      31    118
cli      31    502
dclst    31    630
div      31    331
divs     31    363
doz      31    264
dozi     09    -
lscbx    31    277
maskg    31    29
maskir   31    541
mfsri    31    627
mul      31    107
nabs     31    488
rac      31    818
rlmi     22    -
rrib     31    537
sle      31    153
sleq     31    217
sliq     31    184
slliq    31    248
sllq     31    216
slq      31    152
sraiq    31    952
sraq     31    920
sre      31    665
srea     31    921
sreq     31    729
sriq     31    696
srliq    31    760
srlq     31    728
srq      31    664
svc[l]   17    0


Assembler Note


It might be helpful to current software writers for
the Assembler to flag the discontinued Power
instructions.
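
One way an assembler or a source-scanning tool might implement the flagging suggested in the Assembler Note is sketched below in Python. The function name and the input format are hypothetical: each source line is assumed to begin with the mnemonic (no labels or comments), only the base mnemonics listed in Section G.27 are recognized, and suffixed forms are not handled. The svc and svcl entries are discontinued only when SA = 0, which this sketch does not examine.

```python
DISCONTINUED_POWER_MNEMONICS = frozenset({
    "abs", "clcs", "clf", "cli", "dclst", "div", "divs", "doz", "dozi",
    "lscbx", "maskg", "maskir", "mfsri", "mul", "nabs", "rac", "rlmi",
    "rrib", "sle", "sleq", "sliq", "slliq", "sllq", "slq", "sraiq",
    "sraq", "sre", "srea", "sreq", "sriq", "srliq", "srlq", "srq",
    "svc", "svcl",
})


def flag_discontinued(source_lines):
    """Return (line_number, mnemonic) for each line whose first token is discontinued."""
    flagged = []
    for number, line in enumerate(source_lines, start=1):
        tokens = line.split()
        if tokens and tokens[0].lower() in DISCONTINUED_POWER_MNEMONICS:
            flagged.append((number, tokens[0].lower()))
    return flagged
```

A real assembler would more likely attach the check to its opcode table, so that the diagnostic can also name the PowerPC replacement where one exists.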






Appendix H. New Instructions


The following instructions in the PowerPC Architecture
are new: they are not in the Power Architecture.

They are listed in three groups, according to whether
they exist in all PowerPC implementations, only in
64-bit implementations, or only in 32-bit implementations.

The following instructions are optional: eciwx, ecowx,
fres, frsqrte, fsel, fsqrt[s], slbia, slbie, stfiwx, tlbia,
tlbsync.


H.1 New Instructions for All
Implementations


dcbf      Data Cache Block Flush
dcbi      Data Cache Block Invalidate
dcbst     Data Cache Block Store
dcbt      Data Cache Block Touch
dcbtst    Data Cache Block Touch for Store
divw      Divide Word
divwu     Divide Word Unsigned
eciwx     External Control In Word Indexed
ecowx     External Control Out Word Indexed
eieio     Enforce In-order Execution of I/O
extsb     Extend Sign Byte
fadds     Floating Add Single
fctiw     Floating Convert To Integer Word
fctiwz    Floating Convert To Integer Word with
          round toward Zero
fdivs     Floating Divide Single
fmadds    Floating Multiply-Add Single
fmsubs    Floating Multiply-Subtract Single
fmuls     Floating Multiply Single
fnmadds   Floating Negative Multiply-Add Single
fnmsubs   Floating Negative Multiply-Subtract Single
fres      Floating Reciprocal Estimate Single
frsqrte   Floating Reciprocal Square Root Estimate
fsel      Floating Select
fsqrt[s]  Floating Square Root [Single]
fsubs     Floating Subtract Single
icbi      Instruction Cache Block Invalidate
lwarx     Load Word And Reserve Indexed
mftb      Move From Time Base
mulhw     Multiply High Word
mulhwu    Multiply High Word Unsigned
stfiwx    Store Floating-Point as Integer Word
          Indexed
stwcx.    Store Word Conditional Indexed
subf      Subtract From
tlbia     TLB Invalidate All
tlbsync   TLB Synchronize


H.2 New Instructions for 64-Bit
Implementations Only

cntlzd    Count Leading Zeros Doubleword
divd      Divide Doubleword
divdu     Divide Doubleword Unsigned
extsw     Extend Sign Word
fcfid     Floating Convert From Integer
          Doubleword
fctid     Floating Convert To Integer Doubleword
fctidz    Floating Convert To Integer Doubleword
          with round toward Zero
lwa       Load Word Algebraic
lwaux     Load Word Algebraic with Update Indexed
lwax      Load Word Algebraic Indexed
ld        Load Doubleword
ldarx     Load Doubleword And Reserve Indexed
ldu       Load Doubleword with Update
ldux      Load Doubleword with Update Indexed
ldx       Load Doubleword Indexed
mulhd     Multiply High Doubleword
mulhdu    Multiply High Doubleword Unsigned
mulld     Multiply Low Doubleword
rldcl     Rotate Left Doubleword then Clear Left
rldcr     Rotate Left Doubleword then Clear Right
rldic     Rotate Left Doubleword Immediate then
          Clear
rldicl    Rotate Left Doubleword Immediate then
          Clear Left
rldicr    Rotate Left Doubleword Immediate then
          Clear Right
rldimi    Rotate Left Doubleword Immediate then
          Mask Insert
slbia     SLB Invalidate All
slbie     SLB Invalidate Entry
sld       Shift Left Doubleword
srad      Shift Right Algebraic Doubleword
sradi     Shift Right Algebraic Doubleword Immediate
srd       Shift Right Doubleword
std       Store Doubleword
stdcx.    Store Doubleword Conditional Indexed
stdu      Store Doubleword with Update
stdux     Store Doubleword with Update Indexed
stdx      Store Doubleword Indexed
td        Trap Doubleword
tdi       Trap Doubleword Immediate


H.3 New Instructions for 32-Bit
Implementations Only

mfsrin    Move From Segment Register Indirect


Appendix I. Illegal Instructions


With the exception of the instruction consisting
entirely of binary 0's, the instructions in this class are
available for future extensions of the PowerPC Architecture:
that is, some future version of the PowerPC
Architecture may define any of these instructions to
perform new functions.

The following primary opcodes are illegal.

1, 4, 5, 6, 56, 57, 60, 61

In addition, the following primary opcodes are illegal
for 32-bit implementations (they are defined only for
64-bit implementations).

2, 30, 58, 62

The following primary opcodes have unused extended
opcodes. Their unused extended opcodes can be
determined from the opcode maps. Extended opcodes for
instructions that are defined only for 64-bit implementations
are illegal in 32-bit implementations, and
extended opcodes for instructions that are defined
only for 32-bit implementations are illegal in 64-bit
implementations. All unused extended opcodes are
illegal.

17, 19, 30¹, 31, 59, 62¹, 63

¹ Applies only for 64-bit implementations (illegal
primary opcode for 32-bit implementations)

An instruction consisting entirely of binary 0's is
illegal, and is guaranteed to be illegal in all future
versions of this architecture.
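
The primary-opcode part of this classification can be sketched as follows. The Python below is illustrative only: the function names are hypothetical, the primary opcode is taken from the six high-order bits of the instruction, and unused extended opcodes (which also make an instruction illegal) are not examined because they require the opcode maps.

```python
ALWAYS_ILLEGAL = frozenset({1, 4, 5, 6, 56, 57, 60, 61})
ILLEGAL_ON_32BIT = frozenset({2, 30, 58, 62})


def primary_opcode(insn: int) -> int:
    """Primary opcode: instruction bits 0:5 (the six high-order bits)."""
    return (insn >> 26) & 0x3F


def is_illegal_primary(insn: int, is_64bit: bool) -> bool:
    """True if the word is illegal on the grounds of its primary opcode alone."""
    if insn == 0:
        return True  # all binary 0's: illegal in all versions of the architecture
    opcode = primary_opcode(insn)
    if opcode in ALWAYS_ILLEGAL:
        return True
    return (not is_64bit) and opcode in ILLEGAL_ON_32BIT
```

Note that a non-zero instruction with primary opcode 0 is reported as not illegal here; such instructions belong to the reserved class described in Appendix J.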


Appendix J. Reserved Instructions


The instructions in this class are allocated to specific 
purposes that are outside the scope of the PowerPC 
User Instruction Set Architecture, PowerPC Virtual 
Environment Architecture, and PowerPC Operating 
Environment Architecture. 

The following types of instruction are included in this 
class. 

1. The instruction having primary opcode 0, except 
the instruction consisting entirely of binary 0's 
(which is an illegal instruction: see Section 1.8.2, 
“Illegal Instruction Class” on page 13). 

2. Instructions for the Power Architecture which
have not been included in the PowerPC Architecture.
These are listed in Appendix G, “Incompatibilities
with the Power Architecture” on
page 255.

3. Implementation-specific instructions used to
conform to the PowerPC Architecture specifications.

4. Any other instructions contained in Book IV,
PowerPC Implementation Features for any implementation,
which are not defined in the PowerPC
User Instruction Set Architecture, PowerPC
Virtual Environment Architecture, nor PowerPC
Operating Environment Architecture.


Appendix K. Optional Facilities and Instructions


The facilities (special purpose registers and 
instructions) described in this appendix are optional. 
An implementation may choose to provide all, some, 
or none of them. If a facility is implemented that 
matches semantics of a facility described here, the 
implementation should be as specified here. 


K.1 External Control 

The External Control facility provides a means for a 
problem state program to communicate with a special 
purpose device. Two instructions are provided: 

■ External Control Out Word Indexed (ecowx), which
does the following:

- Computes an Effective Address (EA) as for
  any X-form instruction
- Validates the EA as would be done for a
  store to that address
- Translates the EA to a Real Address
- Transmits the Real Address and a word of
  data from a general register to the device

■ External Control In Word Indexed (eciwx), which
does the following:

- Computes an Effective Address (EA) as for
  any X-form instruction
- Validates the EA as would be done for a load
  from that address
- Translates the EA to a Real Address
- Transmits the Real Address to the device
- Accepts a word of data from the device and
  places it in a general register

Depending on the setting of a control bit in a special
purpose register, the External Access Register (EAR),
the processor either performs the external control
operation or it takes a Data Storage interrupt. The
EAR controls access to the external access facility.
Access to the EAR itself is privileged; the operating
system can determine which tasks are allowed to
issue External Access instructions and when they are
allowed to do so.

Interpretation of the real address transmitted by
ecowx and eciwx and the 32-bit value transmitted by
ecowx is up to the target device. Such interpretation
is not specified by PowerPC Architecture. See the
System Architecture documentation for a given
PowerPC system for details on how the External
Control facility can be used with devices on that
system.

Example

An example of a device designed to be used with the
External Control facility might be a graphics adapter.
The ecowx instruction might be used to send the
device the translated real address of a buffer containing
graphics data, and the word transmitted from
the general register might be control information that
tells the adapter what operation to perform on the
data in the buffer. The eciwx instruction might be
used to load status information from the adapter.

K.1.1 External Access Register

This 32-bit Special Purpose Register controls access
to the External Control facility and, for external
control operations that are permitted, determines
which device is the target.


| E | ///                          | RID  |
  0                                  26 31


Bit    Name  Description

0      E     Enable bit
26:31  RID   Resource ID

All other fields are reserved.

Figure 83. External Access Register


K.1.2 External Access Instructions


External Control In Word Indexed
X-form


eciwx RT,RA,RB


| 31 | RT | RA | RB | 310 | / |
  0    6    11   16   21    31


if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
if EAR_E = 1 then
    raddr ← address translation of EA
    send load request for raddr to
        device identified by EAR_RID
    RT ← ³²0 || word from device
else
    DSISR_11 ← 1
    generate Data Storage interrupt

Let the effective address (EA) be the sum
(RA|0) + (RB).

If EAR_E = 1, a load request for the real address corresponding
to EA is sent to the device identified by
EAR_RID, bypassing the cache. RT_0:31 is set to 0. The
word returned by the device is placed in RT_32:63{0:31}.

If EAR_E = 0, a Data Storage interrupt is taken, with bit
11 of DSISR set to 1.

The eciwx instruction is supported for Effective
Addresses that reference ordinary (T=0) segments
and for EAs mapped by Data BAT registers. The
instruction is not supported and the results are
boundedly undefined for EAs in direct-store (T=1)
segments and for EAs generated when MSR_DR = 0
(real addresses).

The access caused by this instruction is treated as a
load from the location addressed by EA with respect
to protection and reference and change recording.

Special Registers Altered:

None


External Control Out Word Indexed
X-form


ecowx RS,RA,RB


| 31 | RS | RA | RB | 438 | / |
  0    6    11   16   21    31


if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
if EAR_E = 1 then
    raddr ← address translation of EA
    send store request for raddr to
        device identified by EAR_RID
    send (RS_32:63{0:31}) to device
else
    DSISR_11 ← 1
    generate Data Storage interrupt

Let the effective address (EA) be the sum
(RA|0) + (RB).

If EAR_E = 1, a store request for the real address corresponding
to EA and the contents of RS_32:63{0:31} are
sent to the device identified by EAR_RID, bypassing the
cache.

If EAR_E = 0, a Data Storage interrupt is taken, with bit
11 of DSISR set to 1.

The ecowx instruction is supported for Effective
Addresses that reference ordinary (T=0) segments
and for EAs mapped by Data BAT registers. The
instruction is not supported and the results are
boundedly undefined for EAs in direct-store (T=1)
segments and for EAs generated when MSR_DR = 0
(real addresses).

The access caused by this instruction is treated as a
store to the location addressed by EA with respect to
protection and reference and change recording.

Special Registers Altered:

None
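
The behavior of the two instructions can be modeled in software. The following Python sketch is illustrative only and makes several assumptions: it models a 32-bit implementation, the translate argument stands in for address translation (protection checks and translation faults are not modeled), and EchoDevice is a hypothetical device invented for the example. The EAR is decoded as in Figure 83, with bit 0 as the most significant bit.

```python
MASK32 = 0xFFFF_FFFF


class DataStorageInterrupt(Exception):
    """Raised when EAR_E = 0; dsisr has bit 11 set (bit 0 is the most significant)."""

    def __init__(self):
        super().__init__("Data Storage interrupt")
        self.dsisr = 1 << (31 - 11)


class EchoDevice:
    """Hypothetical device: remembers the last word stored at each real address."""

    def __init__(self):
        self.words = {}

    def store(self, raddr, word):
        self.words[raddr] = word

    def load(self, raddr):
        return self.words.get(raddr, 0)


def _target(ear, devices):
    """Select the device named by EAR_RID, or interrupt if EAR_E = 0."""
    if (ear >> 31) & 1 == 0:        # EAR bit 0 (E)
        raise DataStorageInterrupt()
    return devices[ear & 0x3F]      # EAR bits 26:31 (RID)


def effective_address(gpr, ra, rb):
    """EA = (RA|0) + (RB), truncated to 32 bits."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & MASK32


def ecowx(gpr, ear, devices, translate, rs, ra, rb):
    """Send the real address and the word in RS to the target device."""
    device = _target(ear, devices)
    device.store(translate(effective_address(gpr, ra, rb)), gpr[rs] & MASK32)


def eciwx(gpr, ear, devices, translate, rt, ra, rb):
    """Send the real address to the target device and place its reply in RT."""
    device = _target(ear, devices)
    gpr[rt] = device.load(translate(effective_address(gpr, ra, rb))) & MASK32
```

In this model an operating system would grant a task access by setting the E bit and the RID in the EAR before dispatching it, and would clear the E bit for tasks that are not permitted to use the facility.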


Appendix L. Synchronization Requirements for Special
Registers


The processor checks for input and output dependences
with respect to all registers, and honors these
dependences when executing a series of instructions
involving a given register. For example, if an mtspr
instruction writes a value to a particular SPR and an
mfspr instruction later in the instruction stream reads
the same SPR, the mfspr receives the value written
by the mtspr.

Such dependence checking does not extend to certain
side effects of writing to status and control registers,
SPRs, and Segment Registers, as described in the
remainder of this appendix.


The processor automatically provides all synchronization
required for the GPRs, FPRs, CR, LR, CTR, XER,
FPSCR, SRR0, SRR1, DAR, DSISR, SPRG0 through
SPRG3, Time Base, and Decrementer, and for the EE
and RI bits of the MSR, including side effects. These
registers and MSR bits are not discussed further in
this appendix.

For the remainder of this appendix, words like 
“before,” “after,” “preceding,” “following,” etc., 
when referring to instruction sequence, are with 
respect to program order. (Program order is defined 
in Part 2, “PowerPC Virtual Environment Architecture” 
on page 117.) 


L.1 Affected Registers 

Software synchronization may be required for alteration
of the registers listed in the following subsections,
because they affect instruction fetch and
data access.


L.1.1 Instruction Fetch 

Altering the content of the following registers or MSR 
bits may change the manner in which instruction 
addresses are interpreted, or the context in which 
instructions execute. 

■ ASR 

■ Segment Registers 

■ SDR1 

■ IBAT registers

■ MSR bits:

SF, POW, PR, FP, ME, FE0, FE1, SE, BE, IP, IR, LE


L.1.2 Data Access 

Altering the content of the following registers or MSR
bits may change the manner in which data accesses
are performed, or the context in which they are performed.

■ ASR 

■ Segment Registers 

■ SDR1 

■ DBAT registers 

■ EAR 

■ MSR bits: 

SF, POW, PR, DR, LE 

L.2 Context Synchronizing 
Operations 

The following instructions and events comprise the 
context synchronizing operations (see Section 9.7.1, 
“Context Synchronization” on page 145). They can 
be used to synchronize alteration of the registers 
listed above, as described below. 

■ isync

■ sc

■ rfi

■ any interrupt, other than System Reset and
Machine Check

(As described in Chapter 13, “Interrupts” on
page 191, System Reset and Machine Check are
context synchronizing if they are recoverable.)

The sync instruction, although not context- 
synchronizing, can sometimes be used to provide the 
required synchronization, as described below. 

L.3 Software Synchronization 
Requirements 

To ensure that instructions appear to execute in
program order (i.e., with the correct semantics and in
the correct context), software must use synchronization
instructions, as described below, when altering
any of the registers and MSR bits listed in L.1,
“Affected Registers” on page 269.

Sometimes advantage can be taken of the fact that 
certain instructions that occur naturally in the 
program, such as the rfi at the end of an interrupt 
handler, provide the required synchronization. 

Before Alteration 

If the corresponding relocation is enabled (IR = 1 for
Section L.1.1, DR = 1 for Section L.1.2), a context
synchronizing operation or sync instruction must precede
an alteration of any of the registers listed in Section
L.1, with the exception of SDR1 and the MSR.

If the corresponding relocation is enabled, a sync 
instruction must precede an alteration of SDR1. The 
sync forces alterations of Reference and Change bits, 
due to instructions before the alteration of SDR1, to 
be made in the correct context. 

No explicit synchronization is required before software
alters the MSR, because mtmsr is execution
synchronizing (see Section 9.7.2, “Execution
Synchronization” on page 145).

After Alteration 

If the corresponding relocation is enabled (IR = 1 for
Section L.1.1, DR = 1 for Section L.1.2), a context
synchronizing operation must follow an alteration of any
of the registers listed in Section L.1, with the
exception of the MSR.

A context synchronizing operation must follow an
alteration of any of the MSR bits listed in Sections
L.1.1 and L.1.2, except MSR[IP] if software does not
care which value of this bit is used for non-recoverable
System Reset and Machine Check interrupts.

Instructions fetched and/or executed after the alteration
but before the context synchronizing operation
may be fetched and/or executed in either the context
that existed before the alteration or the context
established by the alteration.

Multiple Alterations 

When several of the registers listed in Section L.1 are
altered with no intervening instructions that are 
affected by the alterations, no context synchronizing 
operations or sync instructions are required between 
the alterations. 

Examples 

■ A single Segment Register is to be altered in
isolation:

isync
mtsr SRn,Rx
isync

■ All the Segment Registers are to be reloaded 
upon task dispatch at the end of an interrupt. 

mtsr SR0,R... 
mtsr SR1,R... 

mtsr SR15,R... 
rfi 

Because this instruction sequence reloads all
Segment Registers, it must be executed with
MSR[IR] = 0 and therefore no synchronization is
required before the Segment Registers are
loaded. (If the Segment Register that is being
used for instruction fetch is not to be reloaded,
the sequence can be executed with MSR[IR] = 1,
and still no such synchronization is required.)
The rfi provides the needed synchronization after 
the Segment Registers have been loaded, and 
before subsequent instructions are fetched and 
subsequent loads and stores executed. 
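
The Before Alteration and After Alteration rules above can be summarized as a small lookup. The following Python sketch is only an illustration of those rules: the function name, the register labels, and the use of None for "no requirement stated" are assumptions made here, not part of the architecture. It also ignores the MSR[IP] exception and the Multiple Alterations relaxation.

```python
# Illustrative summary of the Section L.3 rules (names are hypothetical).
CSO = "context synchronizing operation"


def required_sync(register, relocation_enabled):
    """Return (before, after) synchronization for altering `register`.

    `relocation_enabled` is the state of the corresponding MSR bit
    (IR for Section L.1.1 registers, DR for Section L.1.2 registers).
    """
    if register == "MSR":
        # mtmsr is execution synchronizing, so nothing is needed before.
        return (None, CSO)
    if not relocation_enabled:
        # Section L.3 states requirements only for the enabled case.
        return (None, None)
    if register == "SDR1":
        # Only sync is named for SDR1 (Reference and Change bit updates).
        return ("sync", CSO)
    return (CSO + " or sync", CSO)


if __name__ == "__main__":
    for reg in ("ASR", "SDR1", "MSR"):
        print(reg, required_sync(reg, relocation_enabled=True))
```

For the first example above (a single Segment Register altered in isolation with relocation enabled), the sketch yields a synchronizing operation both before and after, which matches the isync, mtsr, isync sequence.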

L.4 Additional Software 
Requirements 

This section describes additional software requirements
with respect to instruction fetching and address
translation. The results of failing to satisfy these
requirements are undefined.

MSR[POW], MSR[LE]

A special sequence of instructions may be
required for changing the Power Management
Enable and Little-Endian Mode bits; see the Book
IV, PowerPC Implementation Features document
for the implementation.

MSR[IR]

MSR[IR] should be altered only from code that is
mapped virtual equals real.

ASR 

If MSR[IR] = 1, alteration of the ASR is permitted
only if the instructions in storage immediately following
the mtspr that alters the ASR are identical
in both the old and the new address space. Any
resulting changes in storage protection or storage
access mode are not guaranteed to take effect
until a context synchronizing operation is executed.

Segment Registers 

No fields in the Segment Register that is being
used for instruction fetch should be altered, with
the exception of the Key bits (Ks and Kp). Alteration
of the Key bits is always permitted. Any
resulting changes in storage protection are not
guaranteed to take effect until a context
synchronizing operation is executed.

SDR1 

SDR1 should be altered only when MSR[IR] = 0.

IBAT registers

No fields in the IBAT Register that is being used
for instruction fetch should be altered, with the
exception of the Valid (V) bit and the Key bits (Ks
and Kp). Alteration of the V bit is permitted only if
the instructions in storage immediately following
the mtspr that alters the IBAT register are also
mapped by the segmented address translation
mechanism to the same address, or if the
instructions are duplicated in the newly mapped
space. Alteration of the Key bits is always
permitted. Any resulting changes in storage
protection or storage access mode are not
guaranteed to take effect until a context
synchronizing operation is executed.

To make an IBAT register valid in a manner such 
that the IBAT register then translates the current 
instruction stream, the following sequence should 
be used if fields in both the upper and lower IBAT 
registers are being altered. 

1. The V bit in the IBAT register should be set to 
zero. 

2. The other fields in the IBAT register should be 
initialized appropriately while the V bit 
remains zero. 

3. The V bit should be set to one. 

4. A context synchronizing operation should be 
executed. 

If all altered fields are contained in either the
upper or lower IBAT register, a single mtspr
suffices (a synchronizing operation is not necessarily
required).
Appendix M. Implementation-Specific SPRs


This appendix lists Special Purpose Register (SPR)
numbers assigned by the PowerPC Architecture
Review Process for implementation-specific uses. If a
register shown here is present in a particular
implementation, a detailed description will be found in Book
IV, PowerPC Implementation Features.

The intent of this list is to ensure that if an SPR is 
needed for a particular function on more than one 
implementation, the same SPR number will be used. 

Note that ordering of the bits shown in the table 
below matches the descriptions in Move To/From 
Special Purpose Register on pages 79 and 80. The 
two 5-bit halves of the SPR number are reversed from 
the order in which they appear in an assembled 
instruction. 
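
As an illustration of the reversal described above, the following Python sketch converts an SPR number, as listed in the table, into the value that would appear in the 10-bit SPR field of an assembled instruction. The function name is hypothetical, and the sketch assumes the field simply holds the two 5-bit halves in swapped order.

```python
def spr_instruction_field(spr_number):
    """Swap the two 5-bit halves of a 10-bit SPR number.

    The result is the value of the 10-bit SPR field as it would
    appear in an assembled mtspr or mfspr instruction.
    """
    if not 0 <= spr_number <= 0x3FF:
        raise ValueError("SPR number must fit in 10 bits")
    low_half = spr_number & 0x1F          # low-order 5 bits of the number
    high_half = (spr_number >> 5) & 0x1F  # high-order 5 bits of the number
    return (low_half << 5) | high_half


if __name__ == "__main__":
    for n in (8, 1022, 1023):
        print(n, format(spr_instruction_field(n), "010b"))
```

Applying the function twice returns the original number, so the same helper can be used when disassembling.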


SPR number
decimal   spr 5:9   spr 0:4   Register name   Privileged

1022      11111     11110     FPECR           yes
1023      11111     11111     PIR             yes

Processor ID Register (PIR) 

This register holds a value that distinguishes this 
processor from others in a multiprocessor. 

Floating-Point Exception Cause Register 
(FPECR) 

This register identifies the reason a Floating-Point 
Exception occurred. 
Appendix N. Interpretation of the DSISR as set by an
Alignment Interrupt 


For most causes of Alignment interrupt, the interrupt 
handler will emulate the interrupting instruction. To 
do this, it needs the following characteristics of the 
interrupting instruction: 

Load or store 

Length (half, word, or double) 

String, multiple, or elementary 
Fixed or float 
Update or non-update 
Byte reverse or not 
Is it dcbz ? 

The PowerPC Architecture provides this information
implicitly, by setting opcode bits in the DSISR that
identify the interrupting instruction type. It is not
necessary for the interrupt handler to load the
interrupting instruction from storage. The mapping is
unique except for a few exceptions that are discussed
below. The near-uniqueness depends upon the fact
that many instructions cannot cause an Alignment
interrupt, such as the fixed- and floating-point
arithmetic instructions and the byte-width loads and
stores.

See Section 13.5.6, “Alignment Interrupt” on
page 196 for a description of how the opcode and
extended opcode are mapped to a DSISR value for an
X-, D-, or DS-form instruction that causes an
Alignment interrupt.

The table on the next page shows the inverse 
mapping: how the DSISR bits identify the interrupting 
instruction. The following notes apply to this table. 


(1) The instructions lwz and lwarx give the same
DSISR bits (all zero). But if lwarx causes an
alignment interrupt, it is an invalid form, so it need not
be emulated in any precise way. It is adequate
for the Alignment interrupt handler to simply
emulate the instruction as if it were an lwz. It is
important that the emulator use the address in the
DAR, rather than computing it from RA/RB/D,
because lwz and lwarx are different formats.

If opcode 0 (“Illegal or reserved”) can cause an
alignment interrupt, it will be indistinguishable
from lwarx and lwz.

(2) These are distinguished by DSISR bits 12:13, which 
are not shown in the table. 

The Alignment interrupt handler will not be able to 
distinguish a floating-point load or store interrupting 
because it is misaligned, or because it addresses 
direct-store. But this does not matter; in either case 
it will be emulated by doing the operation with fixed- 
point instructions. 

The interrupt handler has no need to distinguish 
between an X-form instruction and the corresponding 
D- or DS-form instruction, if one exists. Therefore two 
such instructions may report the same DSISR value 
(all 32 bits). For example, stw and stwx may both 
report either the DSISR value shown in the following 
table for stw, or that shown for stwx. 
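
The relationship between an opcode and DSISR bits 15:21 can be sketched in Python. The bit selection used here is an assumption inferred from the opcode patterns in the table (for X-form, extended opcode bits 29:30, 25, and 21:24 of the instruction; for D- and DS-form, 0b00 followed by primary opcode bits 5 and 1:4); Section 13.5.6 remains the authoritative description. The function names are hypothetical, and the opcode values in the usage lines are those listed for stw and stwx in Appendix O.

```python
def dsisr_15_21_xform(extended_opcode):
    """DSISR bits 15:21 for an X-form instruction (10-bit extended opcode)."""
    bits_21_24 = (extended_opcode >> 6) & 0xF  # instruction bits 21:24
    bit_25 = (extended_opcode >> 5) & 0x1      # instruction bit 25
    bits_29_30 = extended_opcode & 0x3         # instruction bits 29:30
    return (bits_29_30 << 5) | (bit_25 << 4) | bits_21_24


def dsisr_15_21_dform(primary_opcode):
    """DSISR bits 15:21 for a D- or DS-form instruction (6-bit opcode)."""
    bits_1_4 = (primary_opcode >> 1) & 0xF     # instruction bits 1:4
    bit_5 = primary_opcode & 0x1               # instruction bit 5
    return (bit_5 << 4) | bits_1_4             # bits 15:16 are 0b00


def extract_15_21(dsisr):
    """Pull bits 15:21 out of a 32-bit DSISR value (bit 0 is the MSB)."""
    return (dsisr >> 10) & 0x7F


if __name__ == "__main__":
    print(format(dsisr_15_21_dform(36), "07b"))   # stw,  primary opcode 36
    print(format(dsisr_15_21_xform(151), "07b"))  # stwx, extended opcode 151
```

A handler that treats stw and stwx alike would accept either of the two values printed above, in line with the preceding paragraph.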
If DSISR     then it is either   or D/DS-form
15:21 is:    X-form opcode:      opcode:         so the instruction is:

00 0 0000    00000xxx00          x00000          lwarx, lwz, reserved (1)
00 0 0001    00010xxx00          x00010          ldarx
00 0 0010    00100xxx00          x00100          stw
00 0 0011    00110xxx00          x00110          -
00 0 0100    01000xxx00          x01000          lhz
00 0 0101    01010xxx00          x01010          lha
00 0 0110    01100xxx00          x01100          sth
00 0 0111    01110xxx00          x01110          lmw
00 0 1000    10000xxx00          x10000          lfs
00 0 1001    10010xxx00          x10010          lfd
00 0 1010    10100xxx00          x10100          stfs
00 0 1011    10110xxx00          x10110          stfd
00 0 1100    11000xxx00          x11000          -
00 0 1101    11010xxx00          x11010          ld, ldu, lwa (2)
00 0 1110    11100xxx00          x11100          -
00 0 1111    11110xxx00          x11110          std, stdu (2)
00 1 0000    00001xxx00          x00001          lwzu
00 1 0001    00011xxx00          x00011          -
00 1 0010    00101xxx00          x00101          stwu
00 1 0011    00111xxx00          x00111          -
00 1 0100    01001xxx00          x01001          lhzu
00 1 0101    01011xxx00          x01011          lhau
00 1 0110    01101xxx00          x01101          sthu
00 1 0111    01111xxx00          x01111          stmw
00 1 1000    10001xxx00          x10001          lfsu
00 1 1001    10011xxx00          x10011          lfdu
00 1 1010    10101xxx00          x10101          stfsu
00 1 1011    10111xxx00          x10111          stfdu
00 1 1100    11001xxx00          x11001          -
00 1 1101    11011xxx00          x11011          -
00 1 1110    11101xxx00          x11101          -
00 1 1111    11111xxx00          x11111          -
01 0 0000    00000xxx01          -               ldx
01 0 0001    00010xxx01          -               -
01 0 0010    00100xxx01          -               stdx
01 0 0011    00110xxx01          -               -
01 0 0100    01000xxx01          -               -
01 0 0101    01010xxx01          -               lwax
01 0 0110    01100xxx01          -               -
01 0 0111    01110xxx01          -               -
01 0 1000    10000xxx01          -               lswx
01 0 1001    10010xxx01          -               lswi
01 0 1010    10100xxx01          -               stswx
01 0 1011    10110xxx01          -               stswi
01 0 1100    11000xxx01          -               -
01 0 1101    11010xxx01          -               -
01 0 1110    11100xxx01          -               -
01 0 1111    11110xxx01          -               -
01 1 0000    00001xxx01          -               ldux
01 1 0001    00011xxx01          -               -
01 1 0010    00101xxx01          -               stdux
01 1 0011    00111xxx01          -               -
01 1 0100    01001xxx01          -               -
01 1 0101    01011xxx01          -               lwaux
01 1 0110    01101xxx01          -               -
01 1 0111    01111xxx01          -               -
01 1 1000    10001xxx01          -               -
01 1 1001    10011xxx01          -               -
01 1 1010    10101xxx01          -               -
01 1 1011    10111xxx01          -               -
01 1 1100    11001xxx01          -               -
01 1 1101    11011xxx01          -               -
01 1 1110    11101xxx01          -               -
01 1 1111    11111xxx01          -               -
10 0 0000    00000xxx10          -               -
10 0 0001    00010xxx10          -               -
10 0 0010    00100xxx10          -               stwcx.
10 0 0011    00110xxx10          -               stdcx.
10 0 0100    01000xxx10          -               -
10 0 0101    01010xxx10          -               -
10 0 0110    01100xxx10          -               -
10 0 0111    01110xxx10          -               -
10 0 1000    10000xxx10          -               lwbrx
10 0 1001    10010xxx10          -               -
10 0 1010    10100xxx10          -               stwbrx
10 0 1011    10110xxx10          -               -
10 0 1100    11000xxx10          -               lhbrx
10 0 1101    11010xxx10          -               -
10 0 1110    11100xxx10          -               sthbrx
10 0 1111    11110xxx10          -               -
10 1 0000    00001xxx10          -               -
10 1 0001    00011xxx10          -               -
10 1 0010    00101xxx10          -               -
10 1 0011    00111xxx10          -               -
10 1 0100    01001xxx10          -               eciwx
10 1 0101    01011xxx10          -               -
10 1 0110    01101xxx10          -               ecowx
10 1 0111    01111xxx10          -               -
10 1 1000    10001xxx10          -               -
10 1 1001    10011xxx10          -               -
10 1 1010    10101xxx10          -               -
10 1 1011    10111xxx10          -               -
10 1 1100    11001xxx10          -               -
10 1 1101    11011xxx10          -               -
10 1 1110    11101xxx10          -               -
10 1 1111    11111xxx10          -               dcbz
11 0 0000    00000xxx11          -               lwzx
11 0 0001    00010xxx11          -               -
11 0 0010    00100xxx11          -               stwx
11 0 0011    00110xxx11          -               -
11 0 0100    01000xxx11          -               lhzx
11 0 0101    01010xxx11          -               lhax
11 0 0110    01100xxx11          -               sthx
11 0 0111    01110xxx11          -               -
11 0 1000    10000xxx11          -               lfsx
11 0 1001    10010xxx11          -               lfdx
11 0 1010    10100xxx11          -               stfsx
11 0 1011    10110xxx11          -               stfdx
11 0 1100    11000xxx11          -               -
11 0 1101    11010xxx11          -               -
11 0 1110    11100xxx11          -               -
11 0 1111    11110xxx11          -               stfiwx
11 1 0000    00001xxx11          -               lwzux
11 1 0001    00011xxx11          -               -
11 1 0010    00101xxx11          -               stwux
11 1 0011    00111xxx11          -               -
11 1 0100    01001xxx11          -               lhzux
11 1 0101    01011xxx11          -               lhaux
11 1 0110    01101xxx11          -               sthux
11 1 0111    01111xxx11          -               -
11 1 1000    10001xxx11          -               lfsux
11 1 1001    10011xxx11          -               lfdux
11 1 1010    10101xxx11          -               stfsux
11 1 1011    10111xxx11          -               stfdux
11 1 1100    11001xxx11          -               -
11 1 1101    11011xxx11          -               -
11 1 1110    11101xxx11          -               -
11 1 1111    11111xxx11          -               -
Appendix O. PowerPC Instruction Set Sorted by Opcode


This appendix lists all the instructions in the PowerPC
Architecture. A page number is shown for
instructions that are defined in this Book (Part 1,
“PowerPC User Instruction Set Architecture” on
page 1), and the Book number is shown for
instructions that are defined in other Books (Part 2,
“PowerPC Virtual Environment Architecture” on
page 117, and Part 3, “PowerPC Operating Environ¬ 
ment Architecture” on page 141). If an instruction is 
defined in more than one Book, the lowest-numbered 
Book is used. 
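
To use a list sorted by opcode, the primary and extended opcodes must first be extracted from an instruction word. The Python sketch below shows one way to do this for instructions that carry a 10-bit extended opcode in bits 21:30; other forms use narrower extended opcode fields and are not handled. The function names and the three-entry lookup table are illustrative assumptions, with the opcode pairs taken from this appendix.

```python
def primary_opcode(word):
    """Bits 0:5 of a 32-bit instruction word (bit 0 is the MSB)."""
    return (word >> 26) & 0x3F


def extended_opcode_10bit(word):
    """Bits 21:30 of the word, the 10-bit extended opcode field."""
    return (word >> 1) & 0x3FF


# A few (primary, extended) pairs from this appendix, for illustration only.
SAMPLE_OPCODES = {
    (19, 50): "rfi",
    (19, 150): "isync",
    (31, 598): "sync",
}


def lookup(word):
    """Return the mnemonic for `word` if it is in the sample table."""
    key = (primary_opcode(word), extended_opcode_10bit(word))
    return SAMPLE_OPCODES.get(key)


if __name__ == "__main__":
    print(lookup(0x4C00012C))  # word with primary 19, extended 150
    print(lookup(0x7C0004AC))  # word with primary 31, extended 598
```

A full decoder would first use the primary opcode to select the instruction form, and only then decide how many bits of extended opcode to read.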


      Opcode           Mode    Page
Form  Primary  Extend  Dep.¹   / Bk    Mnemonic    Instruction

D     2                0       61      tdi         Trap Doubleword Immediate
D     3                        61      twi         Trap Word Immediate
D     7                        55      mulli       Multiply Low Immediate
D     8                SR      52      subfic      Subtract From Immediate Carrying
D     10                       60      cmpli       Compare Logical Immediate
D     11                       59      cmpi        Compare Immediate
D     12               SR      51      addic       Add Immediate Carrying
D     13               SR      51      addic.      Add Immediate Carrying and Record
D     14                       50      addi        Add Immediate
D     15                       50      addis       Add Immediate Shifted
B     16               CT      21      bc[l][a]    Branch Conditional
SC    17       1               23      sc          System Call
I     18                       21      b[l][a]     Branch
XL    19       0               26      mcrf        Move Condition Register Field
XL    19       16      CT      22      bclr[l]     Branch Conditional to Link Register
XL    19       33              25      crnor       Condition Register NOR
XL    19       50              Bk III  rfi         Return From Interrupt
XL    19       129             25      crandc      Condition Register AND with Complement
XL    19       150             Bk II   isync       Instruction Synchronize
XL    19       193             24      crxor       Condition Register XOR
XL    19       225             24      crnand      Condition Register NAND
XL    19       257             24      crand       Condition Register AND
XL    19       289             25      creqv       Condition Register Equivalent
XL    19       417             25      crorc       Condition Register OR with Complement
XL    19       449             24      cror        Condition Register OR
XL    19       528     CT      22      bcctr[l]    Branch Conditional to Count Register
M     20               SR      74      rlwimi[.]   Rotate Left Word Immediate then Mask Insert
M     21               SR      71      rlwinm[.]   Rotate Left Word Immediate then AND with Mask
M     23               SR      73      rlwnm[.]    Rotate Left Word then AND with Mask
D     24                       64      ori         OR Immediate
D     25                       64      oris        OR Immediate Shifted
D     26                       64      xori        XOR Immediate
D     27                       64      xoris       XOR Immediate Shifted
D     28               SR      63      andi.       AND Immediate
D     29               SR      63      andis.      AND Immediate Shifted
MD    30       0       (SR)    70      rldicl[.]   Rotate Left Doubleword Immediate then Clear Left
MD    30       1       (SR)    70      rldicr[.]   Rotate Left Doubleword Immediate then Clear Right
Opcode
Primary  Extend  Mnemonic       Instruction

30       2       rldic[.]       Rotate Left Doubleword Immediate then Clear
30       3       rldimi[.]      Rotate Left Doubleword Immediate then Mask Insert
30       8       rldcl[.]       Rotate Left Doubleword then Clear Left
30       9       rldcr[.]       Rotate Left Doubleword then Clear Right
31       0       cmp            Compare
31       4       tw             Trap Word
31       8       subfc[o][.]    Subtract From Carrying
31       9       mulhdu[.]      Multiply High Doubleword Unsigned
31       10      addc[o][.]     Add Carrying
31       11      mulhwu[.]      Multiply High Word Unsigned
31       19      mfcr           Move From Condition Register
31       20      lwarx          Load Word And Reserve Indexed
31       21      ldx            Load Doubleword Indexed
31       23      lwzx           Load Word and Zero Indexed
31       24      slw[.]         Shift Left Word
31       26      cntlzw[.]      Count Leading Zeros Word
31       27      sld[.]         Shift Left Doubleword
31       28      and[.]         AND
31       32      cmpl           Compare Logical
31       40      subf[o][.]     Subtract From
31       53      ldux           Load Doubleword with Update Indexed
31       54      dcbst          Data Cache Block Store
31       55      lwzux          Load Word and Zero with Update Indexed
31       58      cntlzd[.]      Count Leading Zeros Doubleword
31       60      andc[.]        AND with Complement
31       68      td             Trap Doubleword
31       73      mulhd[.]       Multiply High Doubleword
31       75      mulhw[.]       Multiply High Word
31       83      mfmsr          Move From Machine State Register
31       84      ldarx          Load Doubleword And Reserve Indexed
31       86      dcbf           Data Cache Block Flush
31       87      lbzx           Load Byte and Zero Indexed
31       104     neg[o][.]      Negate
31       119     lbzux          Load Byte and Zero with Update Indexed
31       124     nor[.]         NOR
31       136     subfe[o][.]    Subtract From Extended
31       138     adde[o][.]     Add Extended
31       144     mtcrf          Move To Condition Register Fields
31       146     mtmsr          Move To Machine State Register
31       149     stdx           Store Doubleword Indexed
31       150     stwcx.         Store Word Conditional Indexed
31       151     stwx           Store Word Indexed
31       181     stdux          Store Doubleword Indexed with Update
31       183     stwux          Store Word with Update Indexed
31       200     subfze[o][.]   Subtract From Zero Extended
31       202     addze[o][.]    Add to Zero Extended
31       210     mtsr           Move To Segment Register
31       214     stdcx.         Store Doubleword Conditional Indexed
31       215     stbx           Store Byte Indexed
31       232     subfme[o][.]   Subtract From Minus One Extended
31       233     mulld[o][.]    Multiply Low Doubleword
31       234     addme[o][.]    Add to Minus One Extended
31       235     mullw[o][.]    Multiply Low Word
31       242     mtsrin         Move To Segment Register Indirect
31       246     dcbtst         Data Cache Block Touch for Store
31       247     stbux          Store Byte with Update Indexed
31       266     add[o][.]      Add
31       278     dcbt           Data Cache Block Touch
Opcode           Mode    Page
Primary  Extend  Dep.¹   / Bk    Mnemonic      Instruction

31       279             31      lhzx          Load Halfword and Zero Indexed
31       284     SR      66      eqv[.]        Equivalent
31       306             Bk III  tlbie         TLB Invalidate Entry
31       310             Bk III  eciwx         External Control In Word Indexed
31       311             31      lhzux         Load Halfword and Zero with Update Indexed
31       316     SR      65      xor[.]        XOR
31       339             80      mfspr         Move From Special Purpose Register
31       341     0       34      lwax          Load Word Algebraic Indexed
31       343             32      lhax          Load Halfword Algebraic Indexed
31       370             Bk III  tlbia         TLB Invalidate All
31       371             Bk II   mftb          Move From Time Base
31       373     0       34      lwaux         Load Word Algebraic with Update Indexed
31       375             32      lhaux         Load Halfword Algebraic with Update Indexed
31       407             37      sthx          Store Halfword Indexed
31       412     SR      66      orc[.]        OR with Complement
31       413     (SR)    77      sradi[.]      Shift Right Algebraic Doubleword Immediate
31       434     0       Bk III  slbie         SLB Invalidate Entry
31       438             Bk III  ecowx         External Control Out Word Indexed
31       439             37      sthux         Store Halfword with Update Indexed
31       444     SR      65      or[.]         OR
31       457     (SR)    58      divdu[o][.]   Divide Doubleword Unsigned
31       459     SR      58      divwu[o][.]   Divide Word Unsigned
31       467             79      mtspr         Move To Special Purpose Register
31       470             181     dcbi          Data Cache Block Invalidate
31       476     SR      65      nand[.]       NAND
31       489     (SR)    57      divd[o][.]    Divide Doubleword
31       491     SR      57      divw[o][.]    Divide Word
31       498     0       Bk III  slbia         SLB Invalidate All
31       512             80      mcrxr         Move to Condition Register from XER
31       533             44      lswx          Load String Word Indexed
31       534             40      lwbrx         Load Word Byte-Reverse Indexed
31       535             100     lfsx          Load Floating-Point Single Indexed
31       536     SR      76      srw[.]        Shift Right Word
31       539     (SR)    76      srd[.]        Shift Right Doubleword
31       566             Bk III  tlbsync       TLB Synchronize
31       567             100     lfsux         Load Floating-Point Single with Update Indexed
31       595     O       Bk III  mfsr          Move From Segment Register
31       597             44      lswi          Load String Word Immediate
31       598             48      sync          Synchronize
31       599             101     lfdx          Load Floating-Point Double Indexed
31       631             101     lfdux         Load Floating-Point Double with Update Indexed
31       659     O       Bk III  mfsrin        Move From Segment Register Indirect
31       661             45      stswx         Store String Word Indexed
31       662             41      stwbrx        Store Word Byte-Reverse Indexed
31       663             103     stfsx         Store Floating-Point Single Indexed
31       695             103     stfsux        Store Floating-Point Single with Update Indexed
31       725             45      stswi         Store String Word Immediate
31       727             104     stfdx         Store Floating-Point Double Indexed
31       759             104     stfdux        Store Floating-Point Double with Update Indexed
31       790             40      lhbrx         Load Halfword Byte-Reverse Indexed
31       792     SR      78      sraw[.]       Shift Right Algebraic Word
31       794     (SR)    78      srad[.]       Shift Right Algebraic Doubleword
31       824     SR      77      srawi[.]      Shift Right Algebraic Word Immediate
31       854             Bk II   eieio         Enforce In-order Execution of I/O
31       918             41      sthbrx        Store Halfword Byte-Reverse Indexed
31       922     SR      67      extsh[.]      Extend Sign Halfword
31       954     SR      67      extsb[.]      Extend Sign Byte
31       982             132     icbi          Instruction Cache Block Invalidate
      Opcode           Mode    Page
Form  Primary  Extend  Dep.¹   / Bk    Mnemonic     Instruction

X     31       983             208     stfiwx       Store Floating-Point as Integer Word Indexed
X     31       986     (SR)    67      extsw[.]     Extend Sign Word
X     31       1014            134     dcbz         Data Cache Block set to Zero
D     32                       33      lwz          Load Word and Zero
D     33                       33      lwzu         Load Word and Zero with Update
D     34                       30      lbz          Load Byte and Zero
D     35                       30      lbzu         Load Byte and Zero with Update
D     36                       38      stw          Store Word
D     37                       38      stwu         Store Word with Update
D     38                       36      stb          Store Byte
D     39                       36      stbu         Store Byte with Update
D     40                       31      lhz          Load Halfword and Zero
D     41                       31      lhzu         Load Halfword and Zero with Update
D     42                       32      lha          Load Halfword Algebraic
D     43                       32      lhau         Load Halfword Algebraic with Update
D     44                       37      sth          Store Halfword
D     45                       37      sthu         Store Halfword with Update
D     46                       42      lmw          Load Multiple Word
D     47                       42      stmw         Store Multiple Word
D     48                       100     lfs          Load Floating-Point Single
D     49                       100     lfsu         Load Floating-Point Single with Update
D     50                       101     lfd          Load Floating-Point Double
D     51                       101     lfdu         Load Floating-Point Double with Update
D     52                       103     stfs         Store Floating-Point Single
D     53                       103     stfsu        Store Floating-Point Single with Update
D     54                       104     stfd         Store Floating-Point Double
D     55                       104     stfdu        Store Floating-Point Double with Update
DS    58       0       0       35      ld           Load Doubleword
DS    58       1       0       35      ldu          Load Doubleword with Update
DS    58       2       0       34      lwa          Load Word Algebraic
A     59       18              107     fdivs[.]     Floating Divide Single
A     59       20              106     fsubs[.]     Floating Subtract Single
A     59       21              106     fadds[.]     Floating Add Single
A     59       22              209     fsqrts[.]    Floating Square Root Single
A     59       24              209     fres[.]      Floating Reciprocal Estimate Single
A     59       25              107     fmuls[.]     Floating Multiply Single
A     59       28              108     fmsubs[.]    Floating Multiply-Subtract Single
A     59       29              108     fmadds[.]    Floating Multiply-Add Single
A     59       30              109     fnmsubs[.]   Floating Negative Multiply-Subtract Single
A     59       31              109     fnmadds[.]   Floating Negative Multiply-Add Single
DS    62       0       0       39      std          Store Doubleword
DS    62       1       0       39      stdu         Store Doubleword with Update
X     63       0               113     fcmpu        Floating Compare Unordered
X     63       12              110     frsp[.]      Floating Round to Single-Precision
X     63       14              111     fctiw[.]     Floating Convert To Integer Word
X     63       15              112     fctiwz[.]    Floating Convert To Integer Word with round toward Zero
A     63       18              107     fdiv[.]      Floating Divide
A     63       20              106     fsub[.]      Floating Subtract
A     63       21              106     fadd[.]      Floating Add
A     63       22              209     fsqrt[.]     Floating Square Root
A     63       23              210     fsel[.]      Floating Select
A     63       25              107     fmul[.]      Floating Multiply
A     63       26              210     frsqrte[.]   Floating Reciprocal Square Root Estimate
A     63       28              108     fmsub[.]     Floating Multiply-Subtract
A     63       29              108     fmadd[.]     Floating Multiply-Add
A     63       30              109     fnmsub[.]    Floating Negative Multiply-Subtract
A     63       31              109     fnmadd[.]    Floating Negative Multiply-Add
X     63       32              113     fcmpo        Floating Compare Ordered
      Opcode           Mode    Page
Form  Primary  Extend  Dep.¹   / Bk    Mnemonic     Instruction

X     63       38              116     mtfsb1[.]    Move To FPSCR Bit 1
X     63       40              105     fneg[.]      Floating Negate
X     63       64              114     mcrfs        Move to Condition Register from FPSCR
X     63       70              116     mtfsb0[.]    Move To FPSCR Bit 0
X     63       72              105     fmr[.]       Floating Move Register
X     63       134             115     mtfsfi[.]    Move To FPSCR Field Immediate
X     63       136             105     fnabs[.]     Floating Negative Absolute Value
X     63       264             105     fabs[.]      Floating Absolute Value
X     63       583             114     mffs[.]      Move From FPSCR
XFL   63       711             115     mtfsf[.]     Move To FPSCR Fields
X     63       814     0       110     fctid[.]     Floating Convert To Integer Doubleword
X     63       815     0       111     fctidz[.]    Floating Convert To Integer Doubleword with round toward Zero
X     63       846     0       112     fcfid[.]     Floating Convert From Integer Doubleword

¹See key to mode dependency column, on page 287.
Appendix P. PowerPC Instruction Set Sorted by Mnemonic


This appendix lists all the instructions in the PowerPC
Architecture. A page number is shown for
instructions that are defined in this Book (Part 1,
“PowerPC User Instruction Set Architecture” on
page 1), and the Book number is shown for
instructions that are defined in other Books (Part 2,
“PowerPC Virtual Environment Architecture” on
page 117, and Part 3, “PowerPC Operating Environment
Architecture” on page 141). If an instruction is
defined in more than one Book, the lowest-numbered
Book is used.


      Opcode           Mode    Page
Form  Primary  Extend  Dep.¹   / Bk    Mnemonic      Instruction

XO    31       266     SR      51      add[o][.]     Add
XO    31       10      SR      52      addc[o][.]    Add Carrying
XO    31       138     SR      53      adde[o][.]    Add Extended
D     14                       50      addi          Add Immediate
D     12               SR      51      addic         Add Immediate Carrying
D     13               SR      51      addic.        Add Immediate Carrying and Record
D     15                       50      addis         Add Immediate Shifted
XO    31       234     SR      53      addme[o][.]   Add to Minus One Extended
XO    31       202     SR      54      addze[o][.]   Add to Zero Extended
X     31       28      SR      65      and[.]        AND
X     31       60      SR      66      andc[.]       AND with Complement
D     28               SR      63      andi.         AND Immediate
D     29               SR      63      andis.        AND Immediate Shifted
I     18                       21      b[l][a]       Branch
B     16               CT      21      bc[l][a]      Branch Conditional
XL    19       528     CT      22      bcctr[l]      Branch Conditional to Count Register
XL    19       16      CT      22      bclr[l]       Branch Conditional to Link Register
X     31       0               59      cmp           Compare
D     11                       59      cmpi          Compare Immediate
X     31       32              60      cmpl          Compare Logical
D     10                       60      cmpli         Compare Logical Immediate
X     31       58      (SR)    68      cntlzd[.]     Count Leading Zeros Doubleword
X     31       26      SR      68      cntlzw[.]     Count Leading Zeros Word
XL    19       257             24      crand         Condition Register AND
XL    19       129             25      crandc        Condition Register AND with Complement
XL    19       289             25      creqv         Condition Register Equivalent
XL    19       225             24      crnand        Condition Register NAND
XL    19       33              25      crnor         Condition Register NOR
XL    19       449             24      cror          Condition Register OR
XL    19       417             25      crorc         Condition Register OR with Complement
XL    19       193             24      crxor         Condition Register XOR
X     31       86              135     dcbf          Data Cache Block Flush
X     31       470             181     dcbi          Data Cache Block Invalidate
X     31       54              134     dcbst         Data Cache Block Store
X     31       278             133     dcbt          Data Cache Block Touch
X     31       246             133     dcbtst        Data Cache Block Touch for Store
X     31       1014            134     dcbz          Data Cache Block set to Zero
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Opcode 

Mode 

Page 

Primary 

Extend 

Dep. 1 

/ Bk 

31 

489 

(SR) 

57 

31 

457 

(SR) 

58 

31 

491 

SR 

57 

31 

459 

SR 

58 

31 

310 


Bk III 

31 

438 


Bk III 

31 

854 


Bk II 

31 

284 

SR 

66 

31 

954 

SR 

67 

31 

922 

SR 

67 

31 

986 

(SR) 

67 

63 

264 


105 

63 

21 


106 

59 

21 


106 

63 

846 

0 

112 

63 

32 


113 

63 

0 


113 

63 

814 

0 

110 

63 

815 

0 

111 

63 

14 


111 

63 

15 


112 

63 

18 


107 

59 

18 


107 

63 

29 


108 

59 

29 


108 

63 

72 


105 

63 

28 


108 

59 

28 


108 

63 

25 


107 

59 

25 


107 

63 

136 


105 

63 

40 


105 

63 

31 


109 

59 

31 


109 

63 

30 


109 

59 

30 


109 

59 

24 


209 

63 

12 


110 

63 

26 


210 

63 

23 


210 

63 

22 


209 

59 

22 


209 

63 

20 


106 

59 

20 


106 

31 

982 


132 

19 

150 


Bk II 

34 



30 

35 



30 

31 

119 


30 

31 

87 


30 

58 

0 

0 

35 

31 

84 

0 

46 

58 

1 

0 

35 

31 

53 

0 

35 

31 

21 

0 

35 

50 



101 

51 



101 




Mnemonic 


divd[o][.] Divide Doubleword 

divdu[o][.] Divide Doubleword Unsigned 

divw[o][.] Divide Word 

divwu[o][.] Divide Word Unsigned 

eciwx External Control In Word Indexed 

ecowx External Control Out Word Indexed 

eieio Enforce In-order Execution of I/O 

eqv[.] Equivalent 

Extend Sign Byte 
Extend Sign Halfword 
Extend Sign Word 
Floating Absolute Value 
fadd[.] Floating Add 

fadds[.] Floating Add Single 

fcfid[.] Floating Convert From Integer Doubleword 

fcmpo Floating Compare Ordered 

fcmpu Floating Compare Unordered 

fctid[.] Floating Convert To Integer Doubleword 

fctidz[.] Floating Convert To Integer Doubleword with round 

toward Zero 

fctiw[.] Floating Convert To Integer Word 

fctiwz[.] Floating Convert To Integer Word with round toward Zero 

fdiv[.] Floating Divide 

fdivs[.] Floating Divide Single 

fmadd[.] Floating Multiply-Add 

fmadds[.] Floating Multiply-Add Single 

fmr[.] Floating Move Register 

fmsub[.] Floating Multiply-Subtract 

fmsubs[.] Floating Multiply-Subtract Single 

fmul[.] Floating Multiply 

fmuls[.] Floating Multiply Single 

fnabs[.] Floating Negative Absolute Value 

fneg[.] Floating Negate 

fnmadd[.] Floating Negative Multiply-Add

fnmadds[.] Floating Negative Multiply-Add Single 

fnmsub[.] Floating Negative Multiply-Subtract 

fnmsubs[.] Floating Negative Multiply-Subtract Single 

fres[.] Floating Reciprocal Estimate Single 

frsp[.] Floating Round to Single-Precision 

frsqrte[.] Floating Reciprocal Square Root Estimate 

fsel[.] Floating Select 

fsqrt[.] Floating Square Root 

fsqrts[.] Floating Square Root Single 

fsub[.] Floating Subtract 

fsubs[.] Floating Subtract Single 

icbi Instruction Cache Block Invalidate 

isync Instruction Synchronize 

lbz Load Byte and Zero

lbzu Load Byte and Zero with Update

lbzux Load Byte and Zero with Update Indexed

lbzx Load Byte and Zero Indexed

ld Load Doubleword

ldarx Load Doubleword And Reserve Indexed

ldu Load Doubleword with Update

ldux Load Doubleword with Update Indexed

ldx Load Doubleword Indexed

lfd Load Floating-Point Double

lfdu Load Floating-Point Double with Update
Opcode
Primary  Extend   Mnemonic      Instruction

31       631      lfdux         Load Floating-Point Double with Update Indexed
31       599      lfdx          Load Floating-Point Double Indexed
48                lfs           Load Floating-Point Single
49                lfsu          Load Floating-Point Single with Update
31       567      lfsux         Load Floating-Point Single with Update Indexed
31       535      lfsx          Load Floating-Point Single Indexed
42                lha           Load Halfword Algebraic
43                lhau          Load Halfword Algebraic with Update
31       375      lhaux         Load Halfword Algebraic with Update Indexed
31       343      lhax          Load Halfword Algebraic Indexed
31       790      lhbrx         Load Halfword Byte-Reverse Indexed
40                lhz           Load Halfword and Zero
41                lhzu          Load Halfword and Zero with Update
31       311      lhzux         Load Halfword and Zero with Update Indexed
31       279      lhzx          Load Halfword and Zero Indexed
46                lmw           Load Multiple Word
31       597      lswi          Load String Word Immediate
31       533      lswx          Load String Word Indexed
58       2        lwa           Load Word Algebraic
31       20       lwarx         Load Word And Reserve Indexed
31       373      lwaux         Load Word Algebraic with Update Indexed
31       341      lwax          Load Word Algebraic Indexed
31       534      lwbrx         Load Word Byte-Reverse Indexed
32                lwz           Load Word and Zero
33                lwzu          Load Word and Zero with Update
31       55       lwzux         Load Word and Zero with Update Indexed
31       23       lwzx          Load Word and Zero Indexed
19       0        mcrf          Move Condition Register Field
63       64       mcrfs         Move to Condition Register from FPSCR
31       512      mcrxr         Move to Condition Register from XER
31       19       mfcr          Move From Condition Register
63       583      mffs[.]       Move From FPSCR
31       83       mfmsr         Move From Machine State Register
31       339      mfspr         Move From Special Purpose Register
31       595      mfsr          Move From Segment Register
31       659      mfsrin        Move From Segment Register Indirect
31       371      mftb          Move From Time Base
31       144      mtcrf         Move To Condition Register Fields
63       70       mtfsb0[.]     Move To FPSCR Bit 0
63       38       mtfsb1[.]     Move To FPSCR Bit 1
63       711      mtfsf[.]      Move To FPSCR Fields
63       134      mtfsfi[.]     Move To FPSCR Field Immediate
31       146      mtmsr         Move To Machine State Register
31       467      mtspr         Move To Special Purpose Register
31       210      mtsr          Move To Segment Register
31       242      mtsrin        Move To Segment Register Indirect
31       73       mulhd[.]      Multiply High Doubleword
31       9        mulhdu[.]     Multiply High Doubleword Unsigned
31       75       mulhw[.]      Multiply High Word
31       11       mulhwu[.]     Multiply High Word Unsigned
31       233      mulld[o][.]   Multiply Low Doubleword
7                 mulli         Multiply Low Immediate
31       235      mullw[o][.]   Multiply Low Word
31       476      nand[.]       NAND
31       104      neg[o][.]     Negate
31       124      nor[.]        NOR
31       444      or[.]         OR
31       412      orc[.]        OR with Complement
Opcode            Mode     Page
Primary  Extend   Dep. 1   / Bk     Mnemonic      Instruction

24                         64       ori           OR Immediate
25                         64       oris          OR Immediate Shifted
19       50                Bk III   rfi           Return From Interrupt
30       8        (SR)     72       rldcl[.]      Rotate Left Doubleword then Clear Left
30       9        (SR)     73       rldcr[.]      Rotate Left Doubleword then Clear Right
30       2        (SR)     71       rldic[.]      Rotate Left Doubleword Immediate then Clear
30       0        (SR)     70       rldicl[.]     Rotate Left Doubleword Immediate then Clear Left
30       1        (SR)     70       rldicr[.]     Rotate Left Doubleword Immediate then Clear Right
30       3        (SR)     74       rldimi[.]     Rotate Left Doubleword Immediate then Mask Insert
20                SR       74       rlwimi[.]     Rotate Left Word Immediate then Mask Insert
21                SR       71       rlwinm[.]     Rotate Left Word Immediate then AND with Mask
23                SR       73       rlwnm[.]      Rotate Left Word then AND with Mask
17       1                 23       sc            System Call
31       498      ()       Bk III   slbia         SLB Invalidate All
31       434      ()       Bk III   slbie         SLB Invalidate Entry
31       27       (SR)     75       sld[.]        Shift Left Doubleword
31       24       SR       75       slw[.]        Shift Left Word
31       794      (SR)     78       srad[.]       Shift Right Algebraic Doubleword
31       413      (SR)     77       sradi[.]      Shift Right Algebraic Doubleword Immediate
31       792      SR       78       sraw[.]       Shift Right Algebraic Word
31       824      SR       77       srawi[.]      Shift Right Algebraic Word Immediate
31       539      (SR)     76       srd[.]        Shift Right Doubleword
31       536      SR       76       srw[.]        Shift Right Word
38                         36       stb           Store Byte
39                         36       stbu          Store Byte with Update
31       247               36       stbux         Store Byte with Update Indexed
31       215               36       stbx          Store Byte Indexed
62       0        ()       39       std           Store Doubleword
31       214      ()       47       stdcx.        Store Doubleword Conditional Indexed
62       1        ()       39       stdu          Store Doubleword with Update
31       181      ()       39       stdux         Store Doubleword with Update Indexed
31       149      ()       39       stdx          Store Doubleword Indexed
54                         104      stfd          Store Floating-Point Double
55                         104      stfdu         Store Floating-Point Double with Update
31       759               104      stfdux        Store Floating-Point Double with Update Indexed
31       727               104      stfdx         Store Floating-Point Double Indexed
31       983               208      stfiwx        Store Floating-Point as Integer Word Indexed
52                         103      stfs          Store Floating-Point Single
53                         103      stfsu         Store Floating-Point Single with Update
31       695               103      stfsux        Store Floating-Point Single with Update Indexed
31       663               103      stfsx         Store Floating-Point Single Indexed
44                         37       sth           Store Halfword
31       918               41       sthbrx        Store Halfword Byte-Reverse Indexed
45                         37       sthu          Store Halfword with Update
31       439               37       sthux         Store Halfword with Update Indexed
31       407               37       sthx          Store Halfword Indexed
47                         42       stmw          Store Multiple Word
31       725               45       stswi         Store String Word Immediate
31       661               45       stswx         Store String Word Indexed
36                         38       stw           Store Word
31       662               41       stwbrx        Store Word Byte-Reverse Indexed
31       150               47       stwcx.        Store Word Conditional Indexed
37                         38       stwu          Store Word with Update
31       183               38       stwux         Store Word with Update Indexed
31       151               38       stwx          Store Word Indexed
31       40       SR       51       subf[o][.]    Subtract From
31       8        SR       52       subfc[o][.]   Subtract From Carrying
31       136      SR       53       subfe[o][.]   Subtract From Extended
        Opcode            Mode     Page
Form    Primary  Extend   Dep. 1   / Bk     Mnemonic       Instruction

D       8                 SR       52       subfic         Subtract From Immediate Carrying
XO      31       232      SR       53       subfme[o][.]   Subtract From Minus One Extended
XO      31       200      SR       54       subfze[o][.]   Subtract From Zero Extended
X       31       598               48       sync           Synchronize
X       31       68       ()       62       td             Trap Doubleword
D       2                 ()       61       tdi            Trap Doubleword Immediate
X       31       370               Bk III   tlbia          TLB Invalidate All
X       31       306               Bk III   tlbie          TLB Invalidate Entry
X       31       566               Bk III   tlbsync        TLB Synchronize
X       31       4                 62       tw             Trap Word
D       3                          61       twi            Trap Word Immediate
X       31       316      SR       65       xor[.]         XOR
D       26                         64       xori           XOR Immediate
D       27                         64       xoris          XOR Immediate Shifted


1 Key to Mode Dependency Column

The entry is shown in parentheses () if the instruction is defined only for 64-bit implementations.

The entry is shown in braces {} if the instruction is defined only for 32-bit implementations.

blank   The instruction has no mode dependence, except that if the instruction refers to storage when in 32-bit mode, only the low-order 32 bits of the 64-bit effective address are used to address storage. Storage reference instructions include loads, stores, branch instructions, etc.

CT      If the instruction tests the Count Register, it tests the low-order 32 bits when in 32-bit mode, and all 64 bits when in 64-bit mode.

SR      The instruction's primary function is mode-independent, but the setting of status registers (such as XER and CR0) is mode-dependent.
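
The mode dependencies in the key can be illustrated with a small model. The following is only a sketch in Python, not architecture text: the function names and the `sixty_four_bit_mode` flag are hypothetical, and the CR0 rule shown for the SR case (signed comparison of the low-order 32 bits of the result with zero when in 32-bit mode) is an assumption drawn from the general description of CR0 setting rather than from this table.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def storage_address(ea, sixty_four_bit_mode):
    """'blank' entries: in 32-bit mode only the low-order 32 bits of the
    64-bit effective address are used to address storage."""
    ea &= MASK64
    return ea if sixty_four_bit_mode else ea & MASK32


def ctr_is_zero(ctr, sixty_four_bit_mode):
    """'CT' entries: the Count Register test looks at the low-order 32 bits
    in 32-bit mode and at all 64 bits in 64-bit mode."""
    ctr &= MASK64
    tested = ctr if sixty_four_bit_mode else ctr & MASK32
    return tested == 0


def cr0_compare(result, sixty_four_bit_mode):
    """'SR' entries (assumed rule): the result is compared with zero as a
    signed value of the width selected by the mode."""
    bits = 64 if sixty_four_bit_mode else 32
    value = result & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit of the tested width is set
        value -= 1 << bits
    if value < 0:
        return "LT"
    if value > 0:
        return "GT"
    return "EQ"
```

For example, under these assumptions a Count Register holding 0x0000000100000000 tests as zero in 32-bit mode but as nonzero in 64-bit mode, and a result of 0x0000000080000000 is negative for status-setting purposes in 32-bit mode but positive in 64-bit mode.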
Index


Numerics 


32-bit mode 163 

A

A-form 10 
AA field 10 
address 14 
effective 15 
real 158 

address translation 178 
32-bit mode 163 
64-bit mode 160 
BAT 174, 178 
block 159 

EA to VA 160, 161, 163, 168, 169 

esid to vsid 160, 161, 163, 168, 169 

overview 159, 168 

Page Table Entry 165, 171, 178 

PTE 165, 171 

Reference bit 178 

RPN 164, 170 

Segment Table Entry 162 

STE 162 

VA to RA 160, 164, 168, 170
VPN 164, 170 
aliasing 125 
alignment 

effect on performance 129 
Alignment interrupt 196 
DSISR 275 
Architecture 
intent 270 
ASR 161 

assembler language 

extended mnemonics 221 
mnemonics 221 
symbols 221 
atomic operation 126 
atomicity 

single-copy 120 


B

B-form 8 
BA field 10 
BAT 159, 174 
BB field 10 
BD field 11 
BE 148 
BF field 11 
BFA field 11 
BI field 11
Big-Endian 233 
block (def) 119 

block address translation 159, 174
BO field 11 

boundedly undefined 12 
Branch Trace 199 
BT field 11 
byte ordering 233 
bytes 4 

C
C 85 
CA 28 

cache management instructions 132 
cache model 122 
cache parameters 131 
Caching Inhibited 155, 177 
Change bit 178, 181, 186, 270 
CIA 6 

Coherence, Memory 177 
combined cache 124 
combining 

accesses 177 
stores 177 
context (def) 143 
context synchronization 145 
CR 17 
CTR 18 

D

D field 11
D-form 9
DAR 151, 195, 197
data
  access
    synchronization 269
data cache instructions 133
Data Storage interrupt 194
dcbf 135
dcbi 181
dcbst 134
dcbt 133
dcbtst 133
dcbz 134
DEC 204
Decrementer interrupt 198
defined instructions 12
delayed Machine Check interrupt 194
denormalization 88
denormalized number 87
direct-store segment 173
double-precision 89
doublewords 4
DR 149
DS field 11
DS-form 9
DSISR 151
  alignment interrupt 275
dual cache 123

E

E (Enable bit) 267
EA 15
EAR 267
eciwx 268
ecowx 268
EE 148
effective address 15, 155, 159
  32-bit 169
  64-bit 161
eieio 125, 135
EQ 18
exception (def) 143
execution synchronization 145
External interrupt 196

F

FE 18, 85
FE0 148
FE1 148
FEX 84
FG 18, 85
FI 85
FL 18, 85
FLM field 11
floating-point
  denormalization 88
  double-precision 89
  exceptions 83, 90
    inexact 95
    invalid operation 92
    overflow 94
    underflow 94
    zero divide 94
  execution models 95
  normalization 88
  number
    denormalized 87
    infinity 87
    normalized 87
    not a number 88
    zero 87
  rounding 90
  sign 88
  single-precision 89
Floating-Point Assist interrupt 199
Floating-Point Unavailable interrupt 198
FP 148
FPCC 85
FPR 84
FPRF 85
FPSCR 84
  C 85
  FE 85
  FEX 84
  FG 85
  FI 85
  FL 85
  FPCC 85
  FPRF 85
  FR 85
  FU 85
  FX 84
  NI 86
  OE 86
  OX 84
  RN 86
  UE 86
  UX 85
  VE 85
  VX 84
  VXCVI 85
  VXIDI 85
  VXIMZ 85
  VXISI 85
  VXSNAN 85
  VXSOFT 85
  VXSQRT 85
  VXVC 85
  VXZDZ 85
  XE 86
  XX 85
  ZE 86


290 PowerPC Architecture First Edition 




FPSCR (continued)
  ZX 85
FR 85
FRA field 11
FRB field 11
FRC field 11
FRS field 11
FRT field 11
FU 18, 85
FX 84
FXM field 11

G

GPR 27
GT 18
Guarded storage 157, 177
Gulliver's Travels 233

H

halfwords 4
hardware (def) 143
hardware description language 5
hashed page table 165, 171
  search 166, 172
HTAB 165, 171
  search 166, 172

I

I-form 8
icbi 132
ILE 148
illegal instructions 13
inexact 95
infinity 87
Inhibited, Caching 177
instruction
  fetch
    synchronization 269
  fields 10, 11, 12, 144
    AA 10
    BA 10
    BB 10
    BD 11
    BF 11
    BFA 11
    BI 11
    BO 11
    BT 11
    D 11
    DS 11
    FLM 11
    FRA 11
    FRB 11
    FRC 11
    FRS 11
    FRT 11
    FXM 11
    L 11
    LI 11
    LK 11
    MB 11
    ME 11
    NB 11
    OE 11
    RA 11
    RB 11
    Rc 11
    RS 12
    RT 12
    SH 12
    SI 12
    SPR 12, 144
    SR 12, 144
    TBR 12
    TO 12
    U 12
    UI 12
    XO 12
  formats 8, 9, 10, 144
    A-form 10
    B-form 8
    D-form 9
    DS-form 9
    I-form 8
    M-form 10
    MD-form 10
    MDS-form 10
    SC-form 8
    X-form 9
    XFL-form 10
    XFX-form 9
    XL-form 9
    XO-form 10
    XS-form 10
instruction cache instructions 132
instruction prefetch 157
Instruction Storage interrupt 195
instruction-caused interrupt 191
instructions
  classes 12
  dcbf 135
  dcbi 181
  dcbst 134
  dcbt 133
  dcbtst 133
  dcbz 134
  defined 12
    forms 13
  eciwx 268
  ecowx 268
  eieio 125, 135
instructions (continued)
icbi 132
illegal 13
invalid forms 13
isync 132
ldarx 126
lwarx 126
optional 13, 267
preferred forms 13
reserved 13
stdcx. 126
storage control 131, 181
stwcx. 126
sync 125
interrupt (def) 143 
interrupt priorities 201 
interrupt synchronization 191 
interrupt vector 193 
interrupts 

Alignment 196 
Data Storage 194 
Decrementer 198 
External 196 
Floating-Point Assist 199 
Floating-Point Unavailable 198 
Instruction Storage 195 
instruction-caused 191 
Machine Check 194 
new MSR 193 
precise 191 
Program 197 
System Call 198 
System Reset 194 
system-caused 191 
Trace 199 

invalid instruction forms 13 
invalid operation 92 
IP 149 
IR 149 
isync 132 

K

K bits 179
in IBAT 271
key, storage 179

L

L field 11 

language used for instruction operation description 

LE 149 

LI field 11 

Little-Endian 233

LK field 11 

load (def) 119 


LR 18 
LT 18 


M 


M-form 10 

Machine Check interrupt 194 
Machine State Register 
Branch Trace Enable 148 
Data Relocate 149 
External Interrupt Enable 148 
FP Available 148 
FP Exception Mode 148 
Instruction Relocate 149 
Interrupt Little-Endian Mode 148 
Interrupt Prefix 149 
Little-Endian Mode 149 
Machine Check Enable 148 
Power Management Function Enable 148 
Problem State 148 
Recoverable Interrupt 149 
Single-Step Trace Enable 148 
Sixty-Four-bit mode 148 
main storage 119 
MB field 11 
MD-form 10 
MDS-form 10 
ME 148 
ME field 11 

memory coherence 120, 155, 177
mismatched WIMG bits 178 
mnemonics 
extended 221 
MSR 148 

N

NB field 11 

Next Instruction Address 150 

NI 86

NIA 6 

no-op 64 

normalization 88 

normalized number 87 

not a number 88 

O

OE 86 

OE field 11 

optional instruction 13 

OV 27 

overflow 94 

OX 84 
P

page fault 156 
page protection 179 
page table 165, 171
search 166, 172 
update 186 

Page Table Entry 165, 171, 178 
POW 148 
PP bits 179 
PR 148 

precise interrupt 191 
preferred instruction forms 13 
prefetch 

instruction 157 
Program interrupt 197 
program order (def) 119 
PTE 165, 171 
PVR 149 

Q

quadwords 4 

R

RA field 11 

RB field 11 

RC bits 178 

Rc field 11 

real address 158, 159 

reference and change recording 178 

Reference bit 178, 181, 186, 270 

register transfer level language 5 

registers 

Address Space Register 161 
Condition Register 17 
Count Register 18 
Data Address Register 151, 195, 197 
Data Storage Interrupt Status Register 151 
Decrementer 204 
External Access Register 267 
Fixed-Point Exception Register 27 
Floating-Point Registers 84 
Floating-Point Status and Control Register 84 
General Purpose Registers 27 
implementation-specific 273 
Link Register 18 
Machine State Register 148 
Machine Status Save/Restore Register 0 147
Machine Status Save/Restore Register 1 147

optional 267 

Processor Version Register 149 
SDR1 165, 171 
Segment Registers 269 


registers (continued) 
SPRGn 151 
SPRs 151, 269, 273 
SRR0 147
SRR1 147 

status and control 269 
Time Base 137, 203 
reserved field 5, 144 
reserved instructions 13 
RI 149

RID (Resource ID) 267 
RN 86 
rounding 90 
RS field 12 
RT field 12 
RTL 5, 144 


S

SC-form 8 
SDR1 165, 171 
SE 148 
segment 

direct-store 159, 173 
ordinary 159 

segment lookaside buffer 163 
Segment Registers 269 
segment table 162 
search 162 
update 186 

Segment Table Entry 162 
SF 148 
SH field 12 
SI field 12 
sign 88 

single-copy atomicity 120 
single-precision 89 
Single-Step Trace 199 
SLB 163 
SO 18, 27 
software 

synchronization 

requirements 270 
speculative operations 157 
split cache 123 
split field notation 8 
SPR field 12, 144 
SPRGn 151 
SPRs 151, 269
SR field 12, 144
SRR0 147
SRR1 147 
STAB 162 
search 162 

status and control registers 269 
STE 162 
storage 
access 

synchronization 269 




storage (continued) 
access atomicity 130 
access order 125, 130 
atomic operation 126 
coherence 120 
consistency 155 
Guarded 157 
instruction restart 130 
order 125 

ordering 125, 135, 155 
reservation 127 
segments 155 
shared 125 
weak ordering 155 
storage access 
definitions 
load 119 

program order 119 
store 119 
floating-point 99 
storage access modes 
defined 176 
supported 177 
storage address 14 
storage control 
instructions 181 
storage control instructions 131
storage key 179 
storage model 155 
storage operations 
speculative 157 
storage protection 179 
storage, Guarded 177 
store (def) 119 
Swift, Jonathan 233 
symbols 221 
sync 125 

sync exceptions 186 
synchronization 144, 186, 269 
context 145 
execution 145 
interrupts 191 
requirements 270 
System Call interrupt 198 
System Reset interrupt 194 
system-caused interrupt 191 

T

table update 186 
TB 137, 203 
TBL 137, 203 
TBR field 12 
TBU 137, 203 
Time Base 137, 203 
TLB 166, 172 
TO field 12 


Trace interrupt 199 

translation lookaside buffer 166, 172 

trap interrupt (def) 143 

U

U field 12
UE 86
UI field 12
undefined 

boundedly 12 
underflow 94 
UX 85 

V

VE 85 

virtual address 159, 161, 164, 169, 170 

virtual storage 128 

VX 84 

VXCVI 85 

VXIDI 85 

VXIMZ 85 

VXISI 85 

VXSNAN 85 

VXSOFT 85 

VXSQRT 85

VXVC 85 

VXZDZ 85

W

WIMG bits 158, 173, 177
words 4 

Write Through 155, 177
write through cache 124 


X


X-form 9
XE 86
XER 27
XFL-form 10
XFX-form 9
XL-form 9
XO field 12
XO-form 10
XS-form 10
XX 85



Z

ZE 86 
zero 87 
zero divide 94
ZX 85